Do not torn down the system when getting invalid status from a TPM chip.
This can happen when panic-on-warn is used.
In addition, print out the value of TPM_STS.x instead of "invalid
status". In order to get the earlier benefits for forensics, also call
dump_stack().
Link: https://lore.kernel.org/keyrings/YKzlTR1AzUigShtZ@kroah.com/
Fixes: 55707d531af6 ("tpm_tis: Add a check for invalid status")
Cc: stable(a)vger.kernel.org
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Greg KH <greg(a)kroah.com>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
v2:
Dump also stack only once.
drivers/char/tpm/tpm_tis_core.c | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..ce410f19eff2 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -188,21 +188,33 @@ static int request_locality(struct tpm_chip *chip, int l)
static u8 tpm_tis_status(struct tpm_chip *chip)
{
struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
- int rc;
+ static unsigned long klog_once;
u8 status;
+ int rc;
rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
if (rc < 0)
return 0;
if (unlikely((status & TPM_STS_READ_ZERO) != 0)) {
- /*
- * If this trips, the chances are the read is
- * returning 0xff because the locality hasn't been
- * acquired. Usually because tpm_try_get_ops() hasn't
- * been called before doing a TPM operation.
- */
- WARN_ONCE(1, "TPM returned invalid status\n");
+ if (!test_and_set_bit(BIT(0), &klog_once)) {
+ /*
+ * If this trips, the chances are the read is
+ * returning 0xff because the locality hasn't been
+ * acquired. Usually because tpm_try_get_ops() hasn't
+ * been called before doing a TPM operation.
+ */
+ dev_err(&chip->dev, "invalid TPM_STS.x 0x%02x, dumping stack for forensics\n",
+ status);
+
+ /*
+ * Dump stack for forensics, as invalid TPM_STS.x could be
+ * potentially triggered by impaired tpm_try_get_ops() or
+ * tpm_find_get_ops().
+ */
+ dump_stack();
+ }
+
return 0;
}
--
2.31.1
Do not torn down the system when getting invalid status from a TPM chip.
This can happen when panic-on-warn is used.
In addition, print out the value of TPM_STS.x instead of "invalid
status". In order to get the earlier benefits for forensics, also call
dump_stack().
Link: https://lore.kernel.org/keyrings/YKzlTR1AzUigShtZ@kroah.com/
Fixes: 55707d531af6 ("tpm_tis: Add a check for invalid status")
Cc: stable(a)vger.kernel.org
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Greg KH <greg(a)kroah.com>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
drivers/char/tpm/tpm_tis_core.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..514a481829c9 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -202,7 +202,16 @@ static u8 tpm_tis_status(struct tpm_chip *chip)
* acquired. Usually because tpm_try_get_ops() hasn't
* been called before doing a TPM operation.
*/
- WARN_ONCE(1, "TPM returned invalid status\n");
+ dev_err_once(&chip->dev, "invalid TPM_STS.x 0x%02x, dumping stack for forensics\n",
+ status);
+
+ /*
+ * Dump stack for forensics, as invalid TPM_STS.x could be
+ * potentially triggered by impaired tpm_try_get_ops() or
+ * tpm_find_get_ops().
+ */
+ dump_stack();
+
return 0;
}
--
2.31.1
The userfaultfd hugetlb tests detect a resv_huge_pages underflow. This
happens when hugetlb_mcopy_atomic_pte() is called with !is_continue on
an index for which we already have a page in the cache. When this
happens, we allocate a second page, double consuming the reservation,
and then fail to insert the page into the cache and return -EEXIST.
To fix this, we first if there exists a page in the cache which already
consumed the reservation, and return -EEXIST immediately if so.
There is still a rare condition where we fail to copy the page contents
AND race with a call for hugetlb_no_page() for this index and again we
will underflow resv_huge_pages. That is fixed in a more complicated
patch not targeted for -stable.
Test:
Hacked the code locally such that resv_huge_pages underflows produce
a warning, then:
./tools/testing/selftests/vm/userfaultfd hugetlb_shared 10
2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success
./tools/testing/selftests/vm/userfaultfd hugetlb 10
2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success
Both tests succeed and produce no warnings. After the
test runs number of free/resv hugepages is correct.
Signed-off-by: Mina Almasry <almasrymina(a)google.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: linux-mm(a)kvack.org
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: linux-mm(a)kvack.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
mm/hugetlb.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ead5d12e0604..76e2a6efc165 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4925,10 +4925,20 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
if (!page)
goto out;
} else if (!*pagep) {
- ret = -ENOMEM;
+ /* If a page already exists, then it's UFFDIO_COPY for
+ * a non-missing case. Return -EEXIST.
+ */
+ if (vm_shared &&
+ hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) {
+ ret = -EEXIST;
+ goto out;
+ }
+
page = alloc_huge_page(dst_vma, dst_addr, 0);
- if (IS_ERR(page))
+ if (IS_ERR(page)) {
+ ret = -ENOMEM;
goto out;
+ }
ret = copy_huge_page_from_user(page,
(const void __user *) src_addr,
--
2.32.0.rc0.204.g9fa02ecfa5-goog
This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7. Ever
since that commit, we've been having issues where a hang in one client
can propagate to another. In particular, a hang in an app can propagate
to the X server which causes the whole desktop to lock up.
Signed-off-by: Jason Ekstrand <jason.ekstrand(a)intel.com>
Reported-by: Marcin Slusarz <marcin.slusarz(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.6+
Cc: Jason Ekstrand <jason.ekstrand(a)intel.com>
Cc: Marcin Slusarz <marcin.slusarz(a)intel.com>
Cc: Jon Bloomfield <jon.bloomfield(a)intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
Signed-off-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
---
drivers/gpu/drm/i915/i915_request.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 970d8f4986bbe..b796197c07722 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1426,10 +1426,8 @@ i915_request_await_execution(struct i915_request *rq,
do {
fence = *child++;
- if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
- i915_sw_fence_set_error_once(&rq->submit, fence->error);
+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
continue;
- }
if (fence->context == rq->fence.context)
continue;
@@ -1527,10 +1525,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
do {
fence = *child++;
- if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
- i915_sw_fence_set_error_once(&rq->submit, fence->error);
+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
continue;
- }
/*
* Requests on the same timeline are explicitly ordered, along
--
2.31.1
From: Chris Wilson <chris(a)chris-wilson.co.uk>
The first tracepoint for a request is trace_dma_fence_init called before
we have associated the request with a device. The tracepoint uses
fence->ops->get_driver_name() as a pretty name, and as we try to report
the device name this oopses as it is then NULL. Support the early
tracepoint by reporting the DRIVER_NAME instead of the actual device
name.
Note that rq->engine remains during the course of request recycling
(SLAB_TYPESAFE_BY_RCU). For the physical engines, the pointer remains
valid, however a virtual engine may be destroyed after the request is
retired. If we process a preempt-to-busy completed request along the
virtual engine, we should make sure we mark the request as no longer
belonging to the virtual engine to remove the dangling pointers from the
tracepoint.
Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring seqno")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Cc: Chintan M Patel <chintan.m.patel(a)intel.com>
Cc: Andi Shyti <andi.shyti(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.7+
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
---
.../drm/i915/gt/intel_execlists_submission.c | 20 ++++++++++++++-----
drivers/gpu/drm/i915/i915_request.c | 7 ++++++-
2 files changed, 21 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index de124870af44..75604e927d34 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3249,6 +3249,18 @@ static struct list_head *virtual_queue(struct virtual_engine *ve)
return &ve->base.execlists.default_priolist.requests;
}
+static void
+virtual_submit_completed(struct virtual_engine *ve, struct i915_request *rq)
+{
+ GEM_BUG_ON(!__i915_request_is_complete(rq));
+ GEM_BUG_ON(rq->engine != &ve->base);
+
+ __i915_request_submit(rq);
+
+ /* Remove the dangling pointer to the stale virtual engine */
+ WRITE_ONCE(rq->engine, ve->siblings[0]);
+}
+
static void rcu_virtual_context_destroy(struct work_struct *wrk)
{
struct virtual_engine *ve =
@@ -3265,8 +3277,7 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
old = fetch_and_zero(&ve->request);
if (old) {
- GEM_BUG_ON(!__i915_request_is_complete(old));
- __i915_request_submit(old);
+ virtual_submit_completed(ve, old);
i915_request_put(old);
}
@@ -3538,13 +3549,12 @@ static void virtual_submit_request(struct i915_request *rq)
/* By the time we resubmit a request, it may be completed */
if (__i915_request_is_complete(rq)) {
- __i915_request_submit(rq);
+ virtual_submit_completed(ve, rq);
goto unlock;
}
if (ve->request) { /* background completion from preempt-to-busy */
- GEM_BUG_ON(!__i915_request_is_complete(ve->request));
- __i915_request_submit(ve->request);
+ virtual_submit_completed(ve, ve->request);
i915_request_put(ve->request);
}
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 970d8f4986bb..aa124adb1051 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -61,7 +61,12 @@ static struct i915_global_request {
static const char *i915_fence_get_driver_name(struct dma_fence *fence)
{
- return dev_name(to_request(fence)->engine->i915->drm.dev);
+ struct i915_request *rq = to_request(fence);
+
+ if (unlikely(!rq->engine)) /* not yet attached to any device */
+ return DRIVER_NAME;
+
+ return dev_name(rq->engine->i915->drm.dev);
}
static const char *i915_fence_get_timeline_name(struct dma_fence *fence)
--
2.26.3
Hi,
This is version 2 of a backport for v5.10.
Changes since v2:
1. Rebased v3 of v5.4 version, I kept numbering to be consistent.
2. The context in patch 1/3 had to be adjusted.
The first patch specifically fixes bug aftert 2nd resume:
BUG: Bad page state in process dbus-daemon pfn:18b01
page:ffffea000062c040 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 compound_mapcount: -30591
flags: 0xfffffc0078141(locked|error|workingset|writeback|head|mappedtodisk|reclaim)
raw: 000fffffc0078141 dead0000000002d0 dead000000000100 0000000000000000
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
bad because of flags: 0x78141(locked|error|workingset|writeback|head|mappedtodisk|reclaim)
Best regards,
Krzysztof
Vitaly Kuznetsov (3):
x86/kvm: Teardown PV features on boot CPU as well
x86/kvm: Disable kvmclock on all CPUs on shutdown
x86/kvm: Disable all PV features on crash
arch/x86/include/asm/kvm_para.h | 10 +---
arch/x86/kernel/kvm.c | 92 ++++++++++++++++++++++++---------
arch/x86/kernel/kvmclock.c | 26 +---------
3 files changed, 72 insertions(+), 56 deletions(-)
--
2.27.0