This test passes pointers obtained from anon_allocate_area to the
userfaultfd and mremap APIs. This causes a problem if the system
allocator returns tagged pointers because with the tagged address ABI
the kernel rejects tagged addresses passed to these APIs, which would
end up causing the test to fail. To make this test compatible with
such system allocators, stop using the system allocator to allocate
memory in anon_allocate_area, and instead just use mmap.
Co-developed-by: Lokesh Gidra <lokeshgidra(a)google.com>
Signed-off-by: Lokesh Gidra <lokeshgidra(a)google.com>
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Fixes: c47174fc362a ("userfaultfd: selftest")
Cc: <stable(a)vger.kernel.org> # 5.4
Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5…
---
v5:
- rebase to 5.14rc1
N.B. when backporting to stable branches, please use the v4 of
this patch:
https://lore.kernel.org/linux-mm/20210707184313.3697385-3-pcc@google.com/
tools/testing/selftests/vm/userfaultfd.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index e363bdaff59d..2ea438e6b8b1 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -210,8 +210,10 @@ static void anon_release_pages(char *rel_area)
static void anon_allocate_area(void **alloc_area)
{
- if (posix_memalign(alloc_area, page_size, nr_pages * page_size))
- err("posix_memalign() failed");
+ *alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+ if (*alloc_area == MAP_FAILED)
+ err("mmap of anonymous memory failed");
}
static void noop_alias_mapping(__u64 *start, size_t len, unsigned long offset)
--
2.32.0.93.g670b81a890-goog
If a user program uses userfaultfd on ranges of heap memory, it may
end up passing a tagged pointer to the kernel in the range.start
field of the UFFDIO_REGISTER ioctl. This can happen when using an
MTE-capable allocator, or on Android if using the Tagged Pointers
feature for MTE readiness [1].
When a fault subsequently occurs, the tag is stripped from the fault
address returned to the application in the fault.address field
of struct uffd_msg. However, from the application's perspective,
the tagged address *is* the memory address, so if the application
is unaware of memory tags, it may get confused by receiving an
address that is, from its point of view, outside of the bounds of the
allocation. We observed this behavior in the kselftest for userfaultfd
[2] but other applications could have the same problem.
Address this by not untagging pointers passed to the userfaultfd
ioctls. Instead, let the system call fail. This will provide an
early indication of problems with tag-unaware userspace code instead
of letting the code get confused later, and is consistent with how
we decided to handle brk/mmap/mremap in commit dcde237319e6 ("mm:
Avoid creating virtual address aliases in brk()/mmap()/mremap()"),
as well as being consistent with the existing tagged address ABI
documentation relating to how ioctl arguments are handled.
The code change is a revert of commit 7d0325749a6c ("userfaultfd:
untag user pointers") plus some fixups to some additional calls to
validate_range that have appeared since then.
[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Andrey Konovalov <andreyknvl(a)gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0…
Fixes: 63f0c6037965 ("arm64: Introduce prctl() options to control the tagged user addresses ABI")
Cc: <stable(a)vger.kernel.org> # 5.4
---
v4:
- document the changes more accurately
- fix new calls to validate_range
Documentation/arm64/tagged-address-abi.rst | 26 +++++++++++++++-------
fs/userfaultfd.c | 26 ++++++++++------------
2 files changed, 30 insertions(+), 22 deletions(-)
diff --git a/Documentation/arm64/tagged-address-abi.rst b/Documentation/arm64/tagged-address-abi.rst
index 459e6b66ff68..0c9120ec58ae 100644
--- a/Documentation/arm64/tagged-address-abi.rst
+++ b/Documentation/arm64/tagged-address-abi.rst
@@ -45,14 +45,24 @@ how the user addresses are used by the kernel:
1. User addresses not accessed by the kernel but used for address space
management (e.g. ``mprotect()``, ``madvise()``). The use of valid
- tagged pointers in this context is allowed with the exception of
- ``brk()``, ``mmap()`` and the ``new_address`` argument to
- ``mremap()`` as these have the potential to alias with existing
- user addresses.
-
- NOTE: This behaviour changed in v5.6 and so some earlier kernels may
- incorrectly accept valid tagged pointers for the ``brk()``,
- ``mmap()`` and ``mremap()`` system calls.
+ tagged pointers in this context is allowed with these exceptions:
+
+ - ``brk()``, ``mmap()`` and the ``new_address`` argument to
+ ``mremap()`` as these have the potential to alias with existing
+ user addresses.
+
+ NOTE: This behaviour changed in v5.6 and so some earlier kernels may
+ incorrectly accept valid tagged pointers for the ``brk()``,
+ ``mmap()`` and ``mremap()`` system calls.
+
+ - The ``range.start``, ``start`` and ``dst`` arguments to the
+ ``UFFDIO_*`` ``ioctl()``s used on a file descriptor obtained from
+ ``userfaultfd()``, as fault addresses subsequently obtained by reading
+ the file descriptor will be untagged, which may otherwise confuse
+ tag-unaware programs.
+
+ NOTE: This behaviour changed in v5.14 and so some earlier kernels may
+ incorrectly accept valid tagged pointers for this system call.
2. User addresses accessed by the kernel (e.g. ``write()``). This ABI
relaxation is disabled by default and the application thread needs to
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index f6e0f0c0d0e5..5c2d806e6ae5 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1236,23 +1236,21 @@ static __always_inline void wake_userfault(struct userfaultfd_ctx *ctx,
}
static __always_inline int validate_range(struct mm_struct *mm,
- __u64 *start, __u64 len)
+ __u64 start, __u64 len)
{
__u64 task_size = mm->task_size;
- *start = untagged_addr(*start);
-
- if (*start & ~PAGE_MASK)
+ if (start & ~PAGE_MASK)
return -EINVAL;
if (len & ~PAGE_MASK)
return -EINVAL;
if (!len)
return -EINVAL;
- if (*start < mmap_min_addr)
+ if (start < mmap_min_addr)
return -EINVAL;
- if (*start >= task_size)
+ if (start >= task_size)
return -EINVAL;
- if (len > task_size - *start)
+ if (len > task_size - start)
return -EINVAL;
return 0;
}
@@ -1316,7 +1314,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
vm_flags |= VM_UFFD_MINOR;
}
- ret = validate_range(mm, &uffdio_register.range.start,
+ ret = validate_range(mm, uffdio_register.range.start,
uffdio_register.range.len);
if (ret)
goto out;
@@ -1522,7 +1520,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister)))
goto out;
- ret = validate_range(mm, &uffdio_unregister.start,
+ ret = validate_range(mm, uffdio_unregister.start,
uffdio_unregister.len);
if (ret)
goto out;
@@ -1671,7 +1669,7 @@ static int userfaultfd_wake(struct userfaultfd_ctx *ctx,
if (copy_from_user(&uffdio_wake, buf, sizeof(uffdio_wake)))
goto out;
- ret = validate_range(ctx->mm, &uffdio_wake.start, uffdio_wake.len);
+ ret = validate_range(ctx->mm, uffdio_wake.start, uffdio_wake.len);
if (ret)
goto out;
@@ -1711,7 +1709,7 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
sizeof(uffdio_copy)-sizeof(__s64)))
goto out;
- ret = validate_range(ctx->mm, &uffdio_copy.dst, uffdio_copy.len);
+ ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len);
if (ret)
goto out;
/*
@@ -1768,7 +1766,7 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
sizeof(uffdio_zeropage)-sizeof(__s64)))
goto out;
- ret = validate_range(ctx->mm, &uffdio_zeropage.range.start,
+ ret = validate_range(ctx->mm, uffdio_zeropage.range.start,
uffdio_zeropage.range.len);
if (ret)
goto out;
@@ -1818,7 +1816,7 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
sizeof(struct uffdio_writeprotect)))
return -EFAULT;
- ret = validate_range(ctx->mm, &uffdio_wp.range.start,
+ ret = validate_range(ctx->mm, uffdio_wp.range.start,
uffdio_wp.range.len);
if (ret)
return ret;
@@ -1866,7 +1864,7 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
sizeof(uffdio_continue) - (sizeof(__s64))))
goto out;
- ret = validate_range(ctx->mm, &uffdio_continue.range.start,
+ ret = validate_range(ctx->mm, uffdio_continue.range.start,
uffdio_continue.range.len);
if (ret)
goto out;
--
2.32.0.93.g670b81a890-goog
This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7. Ever
since that commit, we've been having issues where a hang in one client
can propagate to another. In particular, a hang in an app can propagate
to the X server which causes the whole desktop to lock up.
Error propagation along fences sound like a good idea, but as your bug
shows, surprising consequences, since propagating errors across security
boundaries is not a good thing.
What we do have is track the hangs on the ctx, and report information to
userspace using RESET_STATS. That's how arb_robustness works. Also, if my
understanding is still correct, the EIO from execbuf is when your context
is banned (because not recoverable or too many hangs). And in all these
cases it's up to userspace to figure out what is all impacted and should
be reported to the application, that's not on the kernel to guess and
automatically propagate.
What's more, we're also building more features on top of ctx error
reporting with RESET_STATS ioctl: Encrypted buffers use the same, and the
userspace fence wait also relies on that mechanism. So it is the path
going forward for reporting gpu hangs and resets to userspace.
So all together that's why I think we should just bury this idea again as
not quite the direction we want to go to, hence why I think the revert is
the right option here.
For backporters: Please note that you _must_ have a backport of
https://lore.kernel.org/dri-devel/20210602164149.391653-2-jason@jlekstrand.…
for otherwise backporting just this patch opens up a security bug.
v2: Augment commit message. Also restore Jason's sob that I
accidentally lost.
v3: Add a note for backporters
Signed-off-by: Jason Ekstrand <jason(a)jlekstrand.net>
Reported-by: Marcin Slusarz <marcin.slusarz(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.6+
Cc: Jason Ekstrand <jason.ekstrand(a)intel.com>
Cc: Marcin Slusarz <marcin.slusarz(a)intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Reviewed-by: Jon Bloomfield <jon.bloomfield(a)intel.com>
---
drivers/gpu/drm/i915/i915_request.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 86b4c9f2613d5..09ebea9a0090a 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1399,10 +1399,8 @@ i915_request_await_execution(struct i915_request *rq,
do {
fence = *child++;
- if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
- i915_sw_fence_set_error_once(&rq->submit, fence->error);
+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
continue;
- }
if (fence->context == rq->fence.context)
continue;
@@ -1499,10 +1497,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
do {
fence = *child++;
- if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
- i915_sw_fence_set_error_once(&rq->submit, fence->error);
+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
continue;
- }
/*
* Requests on the same timeline are explicitly ordered, along
--
2.31.1
We have been hitting some early ENOSPC issues in production with more
recent kernels, and I tracked it down to us simply not flushing delalloc
as aggressively as we should be. With tracing I was seeing us failing
all tickets with all of the block rsvs at or around 0, with very little
pinned space, but still around 120MiB of outstanding bytes_may_used.
Upon further investigation I saw that we were flushing around 14 pages
per shrink call for delalloc, despite having around 2GiB of delalloc
outstanding.
Consider the example of a 8 way machine, all CPUs trying to create a
file in parallel, which at the time of this commit requires 5 items to
do. Assuming a 16k leaf size, we have 10MiB of total metadata reclaim
size waiting on reservations. Now assume we have 128MiB of delalloc
outstanding. With our current math we would set items to 20, and then
set to_reclaim to 20 * 256k, or 5MiB.
Assuming that we went through this loop all 3 times, for both
FLUSH_DELALLOC and FLUSH_DELALLOC_WAIT, and then did the full loop
twice, we'd only flush 60MiB of the 128MiB delalloc space. This could
leave a fair bit of delalloc reservations still hanging around by the
time we go to ENOSPC out all the remaining tickets.
Fix this two ways. First, change the calculations to be a fraction of
the total delalloc bytes on the system. Prior to this change we were
calculating based on dirty inodes so our math made more sense, now it's
just completely unrelated to what we're actually doing.
Second add a FLUSH_DELALLOC_FULL state, that we hold off until we've
gone through the flush states at least once. This will empty the system
of all delalloc so we're sure to be truly out of space when we start
failing tickets.
I'm tagging stable 5.10 and forward, because this is where we started
using the page stuff heavily again. This affects earlier kernel
versions as well, but would be a pain to backport to them as the
flushing mechanisms aren't the same.
CC: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
---
fs/btrfs/ctree.h | 9 ++++----
fs/btrfs/space-info.c | 40 +++++++++++++++++++++++++-----------
include/trace/events/btrfs.h | 1 +
3 files changed, 34 insertions(+), 16 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index d7ef4d7d2c1a..232ff1a49ca6 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2783,10 +2783,11 @@ enum btrfs_flush_state {
FLUSH_DELAYED_REFS = 4,
FLUSH_DELALLOC = 5,
FLUSH_DELALLOC_WAIT = 6,
- ALLOC_CHUNK = 7,
- ALLOC_CHUNK_FORCE = 8,
- RUN_DELAYED_IPUTS = 9,
- COMMIT_TRANS = 10,
+ FLUSH_DELALLOC_FULL = 7,
+ ALLOC_CHUNK = 8,
+ ALLOC_CHUNK_FORCE = 9,
+ RUN_DELAYED_IPUTS = 10,
+ COMMIT_TRANS = 11,
};
int btrfs_subvolume_reserve_metadata(struct btrfs_root *root,
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index af161eb808a2..1a7e44dd0ba7 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -494,6 +494,11 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
long time_left;
int loops;
+ delalloc_bytes = percpu_counter_sum_positive(&fs_info->delalloc_bytes);
+ ordered_bytes = percpu_counter_sum_positive(&fs_info->ordered_bytes);
+ if (delalloc_bytes == 0 && ordered_bytes == 0)
+ return;
+
/* Calc the number of the pages we need flush for space reservation */
if (to_reclaim == U64_MAX) {
items = U64_MAX;
@@ -501,22 +506,21 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
/*
* to_reclaim is set to however much metadata we need to
* reclaim, but reclaiming that much data doesn't really track
- * exactly, so increase the amount to reclaim by 2x in order to
- * make sure we're flushing enough delalloc to hopefully reclaim
- * some metadata reservations.
+ * exactly. What we really want to do is reclaim full inode's
+ * worth of reservations, however that's not available to us
+ * here. We will take a fraction of the delalloc bytes for our
+ * flushing loops and hope for the best. Delalloc will expand
+ * the amount we write to cover an entire dirty extent, which
+ * will reclaim the metadata reservation for that range. If
+ * it's not enough subsequent flush stages will be more
+ * aggressive.
*/
+ to_reclaim = max(to_reclaim, delalloc_bytes >> 3);
items = calc_reclaim_items_nr(fs_info, to_reclaim) * 2;
- to_reclaim = items * EXTENT_SIZE_PER_ITEM;
}
trans = (struct btrfs_trans_handle *)current->journal_info;
- delalloc_bytes = percpu_counter_sum_positive(
- &fs_info->delalloc_bytes);
- ordered_bytes = percpu_counter_sum_positive(&fs_info->ordered_bytes);
- if (delalloc_bytes == 0 && ordered_bytes == 0)
- return;
-
/*
* If we are doing more ordered than delalloc we need to just wait on
* ordered extents, otherwise we'll waste time trying to flush delalloc
@@ -596,8 +600,11 @@ static void flush_space(struct btrfs_fs_info *fs_info,
break;
case FLUSH_DELALLOC:
case FLUSH_DELALLOC_WAIT:
+ case FLUSH_DELALLOC_FULL:
+ if (state == FLUSH_DELALLOC_FULL)
+ num_bytes = U64_MAX;
shrink_delalloc(fs_info, space_info, num_bytes,
- state == FLUSH_DELALLOC_WAIT, for_preempt);
+ state != FLUSH_DELALLOC, for_preempt);
break;
case FLUSH_DELAYED_REFS_NR:
case FLUSH_DELAYED_REFS:
@@ -907,6 +914,14 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
commit_cycles--;
}
+ /*
+ * We do not want to empty the system of delalloc unless we're
+ * under heavy pressure, so allow one trip through the flushing
+ * logic before we start doing a FLUSH_DELALLOC_FULL.
+ */
+ if (flush_state == FLUSH_DELALLOC_FULL && !commit_cycles)
+ flush_state++;
+
/*
* We don't want to force a chunk allocation until we've tried
* pretty hard to reclaim space. Think of the case where we
@@ -1070,7 +1085,7 @@ static void btrfs_preempt_reclaim_metadata_space(struct work_struct *work)
* so if we now have space to allocate do the force chunk allocation.
*/
static const enum btrfs_flush_state data_flush_states[] = {
- FLUSH_DELALLOC_WAIT,
+ FLUSH_DELALLOC_FULL,
RUN_DELAYED_IPUTS,
COMMIT_TRANS,
ALLOC_CHUNK_FORCE,
@@ -1159,6 +1174,7 @@ static const enum btrfs_flush_state evict_flush_states[] = {
FLUSH_DELAYED_REFS,
FLUSH_DELALLOC,
FLUSH_DELALLOC_WAIT,
+ FLUSH_DELALLOC_FULL,
ALLOC_CHUNK,
COMMIT_TRANS,
};
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index d754c6ec1797..e21744a3cdb6 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -94,6 +94,7 @@ struct btrfs_space_info;
EM( FLUSH_DELAYED_ITEMS, "FLUSH_DELAYED_ITEMS") \
EM( FLUSH_DELALLOC, "FLUSH_DELALLOC") \
EM( FLUSH_DELALLOC_WAIT, "FLUSH_DELALLOC_WAIT") \
+ EM( FLUSH_DELALLOC_FULL, "FLUSH_DELALLOC_FULL") \
EM( FLUSH_DELAYED_REFS_NR, "FLUSH_DELAYED_REFS_NR") \
EM( FLUSH_DELAYED_REFS, "FLUSH_ELAYED_REFS") \
EM( ALLOC_CHUNK, "ALLOC_CHUNK") \
--
2.26.3
We use the async_delalloc_pages mechanism to make sure that we've
completed our async work before trying to continue our delalloc
flushing. The reason for this is we need to see any ordered extents
that were created by our delalloc flushing. However we're waking up
before we do the submit work, which is before we create the ordered
extents. This is a pretty wide race window where we could potentially
think there are no ordered extents and thus exit shrink_delalloc
prematurely. Fix this by waking us up after we've done the work to
create ordered extents.
cc: stable(a)vger.kernel.org
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov(a)suse.com>
---
fs/btrfs/inode.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e6eb20987351..0f42864d5dd4 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1290,11 +1290,6 @@ static noinline void async_cow_submit(struct btrfs_work *work)
nr_pages = (async_chunk->end - async_chunk->start + PAGE_SIZE) >>
PAGE_SHIFT;
- /* atomic_sub_return implies a barrier */
- if (atomic_sub_return(nr_pages, &fs_info->async_delalloc_pages) <
- 5 * SZ_1M)
- cond_wake_up_nomb(&fs_info->async_submit_wait);
-
/*
* ->inode could be NULL if async_chunk_start has failed to compress,
* in which case we don't have anything to submit, yet we need to
@@ -1303,6 +1298,11 @@ static noinline void async_cow_submit(struct btrfs_work *work)
*/
if (async_chunk->inode)
submit_compressed_extents(async_chunk);
+
+ /* atomic_sub_return implies a barrier */
+ if (atomic_sub_return(nr_pages, &fs_info->async_delalloc_pages) <
+ 5 * SZ_1M)
+ cond_wake_up_nomb(&fs_info->async_submit_wait);
}
static noinline void async_cow_free(struct btrfs_work *work)
--
2.26.3
hallo
all fine here on an Intel i7-6700 box.
thanks
+++ UPDATE +++
##############
Regarding the above I need to be more accurate:
The kernel compiles, boots and runs without errors -so far -.
BUT: it seems under load it could lead to an dead box.
what I did:
running in one terminal: dmesg -w
and in an second terminal:
make clean && ccache -Czs && time make all -j $(nproc)
leads within a minutes to an complete dead box after make has started !
- no output regarding crashes, etc. in the first terminal running dmesg
- no mouse, no keyboard, no switch via CRTL+ALT+F3 to a console
- the box is dead
- only the power button helps
with kernel 5.13.1 all the above is fine
anyone (too) ?
--
regards
Ronald
The commit 2b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
fixes an issue of "sub-device is removed where the context entry is cleared
for all aliases". But this commit didn't consider the PASID entry and PASID
table in VT-d scalable mode. This fix increases the coverage of scalable
mode.
Suggested-by: Sanjay Kumar <sanjay.k.kumar(a)intel.com>
Fixes: 8038bdb855331 ("iommu/vt-d: Only clear real DMA device's context entries")
Fixes: 2b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Cc: stable(a)vger.kernel.org # v5.6+
Cc: Jon Derrick <jonathan.derrick(a)intel.com>
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
---
drivers/iommu/intel/iommu.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 57270290d62b..dd22fc7d5176 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4472,14 +4472,13 @@ static void __dmar_remove_one_dev_info(struct device_domain_info *info)
iommu = info->iommu;
domain = info->domain;
- if (info->dev) {
+ if (info->dev && !dev_is_real_dma_subdevice(info->dev)) {
if (dev_is_pci(info->dev) && sm_supported(iommu))
intel_pasid_tear_down_entry(iommu, info->dev,
PASID_RID2PASID, false);
iommu_disable_dev_iotlb(info);
- if (!dev_is_real_dma_subdevice(info->dev))
- domain_context_clear(info);
+ domain_context_clear(info);
intel_pasid_free_table(info->dev);
}
--
2.25.1
Check the allocation size before toggling kfence_allocation_gate.
This way allocations that can't be served by KFENCE will not result in
waiting for another CONFIG_KFENCE_SAMPLE_INTERVAL without allocating
anything.
Suggested-by: Marco Elver <elver(a)google.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: stable(a)vger.kernel.org # 5.12+
Signed-off-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Marco Elver <elver(a)google.com>
---
mm/kfence/core.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index d7666ace9d2e4..2623ff401a104 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -733,6 +733,13 @@ void kfence_shutdown_cache(struct kmem_cache *s)
void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
{
+ /*
+ * Perform size check before switching kfence_allocation_gate, so that
+ * we don't disable KFENCE without making an allocation.
+ */
+ if (size > PAGE_SIZE)
+ return NULL;
+
/*
* allocation_gate only needs to become non-zero, so it doesn't make
* sense to continue writing to it and pay the associated contention
@@ -757,9 +764,6 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
if (!READ_ONCE(kfence_enabled))
return NULL;
- if (size > PAGE_SIZE)
- return NULL;
-
return kfence_guarded_alloc(s, size, flags);
}
--
2.32.0.93.g670b81a890-goog
Xiaochen Zou (1):
can: fix a potential UAF access in j1939_session_deactivate(). Both
session and session->priv may be freed in
j1939_session_deactivate_locked(). It leads to potential UAF read
and write in j1939_session_list_unlock(). The free chain is
j1939_session_deactivate_locked()->j1939_session_put()->__j1939_session_release()->j1939_session_destroy().
To fix this bug, I moved j1939_session_put() behind
j1939_session_deactivate_locked(), and guarded it with a check of
active since the session would be freed only if active is true.
net/can/j1939/transport.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--
2.17.1
>From 9c4733d093e05db22eb89825579c83e020c3c1a6 Mon Sep 17 00:00:00 2001
From: Xiaochen Zou <xzou017(a)ucr.edu>
Date: Tue, 13 Jul 2021 13:15:59 -0700
Subject: [PATCH 1/1] can: fix a potential UAF access in
j1939_session_deactivate().
To: greg(a)kroah.com
Cc: stable@vger.kernel.org,netdev@vger.kernel.org,linux-can@vger.kernel.org
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.17.1"
This is a multi-part message in MIME format.
The patch titled
Subject: mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction
has been added to the -mm tree. Its filename is
mm-page_alloc-fix-page_poison=1-init_on_alloc_default_on-interaction.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-page_poison%3D1…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-page_poison%3D1…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Sergei Trofimovich <slyfox(a)gentoo.org>
Subject: mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction
To reproduce the failure we need the following system:
- kernel command: page_poison=1 init_on_free=0 init_on_alloc=0
- kernel config:
* CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
* CONFIG_INIT_ON_FREE_DEFAULT_ON=y
* CONFIG_PAGE_POISONING=y
0000000085629bdd: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0000000022861832: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000c597f5b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
CPU: 11 PID: 15195 Comm: bash Kdump: loaded Tainted: G U O 5.13.1-gentoo-x86_64 #1
Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2801 01/13/2021
Call Trace:
dump_stack+0x64/0x7c
__kernel_unpoison_pages.cold+0x48/0x84
post_alloc_hook+0x60/0xa0
get_page_from_freelist+0xdb8/0x1000
__alloc_pages+0x163/0x2b0
__get_free_pages+0xc/0x30
pgd_alloc+0x2e/0x1a0
? dup_mm+0x37/0x4f0
mm_init+0x185/0x270
dup_mm+0x6b/0x4f0
? __lock_task_sighand+0x35/0x70
copy_process+0x190d/0x1b10
kernel_clone+0xba/0x3b0
__do_sys_clone+0x8f/0xb0
do_syscall_64+0x68/0x80
? do_syscall_64+0x11/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Before the 51cba1eb ("init_on_alloc: Optimize static branches")
init_on_alloc never enabled static branch by default. It could only be
enabed explicitly by init_mem_debugging_and_hardening().
But after the 51cba1eb static branch could already be enabled by default.
There was no code to ever disable it. That caused page_poison=1 /
init_on_free=1 conflict.
This change extends init_mem_debugging_and_hardening() to also disable
static branch disabling.
Link: https://lkml.kernel.org/r/20210712215816.1512739-1-slyfox@gentoo.org
Fixes: 51cba1eb ("init_on_alloc: Optimize static branches")
Signed-off-by: Sergei Trofimovich <slyfox(a)gentoo.org>
Reported-by: <bowsingbetee(a)pm.me>
Reported-by: Mikhail Morfikov
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-fix-page_poison=1-init_on_alloc_default_on-interaction
+++ a/mm/page_alloc.c
@@ -840,18 +840,22 @@ void init_mem_debugging_and_hardening(vo
}
#endif
- if (_init_on_alloc_enabled_early) {
- if (page_poisoning_requested)
+ if (_init_on_alloc_enabled_early ||
+ IS_ENABLED(CONFIG_INIT_ON_ALLOC_DEFAULT_ON)) {
+ if (page_poisoning_requested) {
pr_info("mem auto-init: CONFIG_PAGE_POISONING is on, "
"will take precedence over init_on_alloc\n");
- else
+ static_branch_disable(&init_on_alloc);
+ } else
static_branch_enable(&init_on_alloc);
}
- if (_init_on_free_enabled_early) {
- if (page_poisoning_requested)
+ if (_init_on_free_enabled_early ||
+ IS_ENABLED(CONFIG_INIT_ON_FREE_DEFAULT_ON)) {
+ if (page_poisoning_requested) {
pr_info("mem auto-init: CONFIG_PAGE_POISONING is on, "
"will take precedence over init_on_free\n");
- else
+ static_branch_disable(&init_on_free);
+ } else
static_branch_enable(&init_on_free);
}
_
Patches currently in -mm which might be from slyfox(a)gentoo.org are
mm-page_alloc-fix-page_poison=1-init_on_alloc_default_on-interaction.patch
hallo
all fine here on an Intel i7-6700 box.
thanks
+++ UPDATE +++
##############
Regarding the above I need to be more accurate:
The kernel compiles, boots and runs without errors -so far -.
BUT: it seems under load it could lead to an dead box.
what I did:
running in one terminal: dmesg -w
and in an second terminal:
make clean && ccache -Czs && time make all -j $(nproc)
leads within a minutes to an complete dead box after make has started !
- no output regarding crashes, etc. in the first terminal running dmesg
- no mouse, no keyboard, no switch via CRTL+ALT+F3 to a console
- the box is dead
- only the power button helps
with kernel 5.13.1 all the above is fine
anyone (too) ?
--
regards
Ronald
The patch titled
Subject: mm/hugetlb: fix refs calculation from unaligned @vaddr
has been added to the -mm tree. Its filename is
mm-hugetlb-fix-refs-calculation-from-unaligned-vaddr.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-fix-refs-calculation-f…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-fix-refs-calculation-f…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Joao Martins <joao.m.martins(a)oracle.com>
Subject: mm/hugetlb: fix refs calculation from unaligned @vaddr
commit 82e5d378b0e47 ("mm/hugetlb: refactor subpage recording") refactored
the count of subpages but missed an edge case when @vaddr is not aligned
to PAGE_SIZE e.g. when close to vma->vm_end. It would then errousnly set
@refs to 0 and record_subpages_vmas() wouldn't set the @pages array
element to its value, consequently causing the reported null-deref by
syzbot.
Fix it by aligning down @vaddr by PAGE_SIZE in @refs calculation.
Link: https://lkml.kernel.org/r/20210713152440.28650-1-joao.m.martins@oracle.com
Fixes: 82e5d378b0e47 ("mm/hugetlb: refactor subpage recording")
Reported-by: syzbot+a3fcd59df1b372066f5a(a)syzkaller.appspotmail.com
Signed-off-by: Joao Martins <joao.m.martins(a)oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-refs-calculation-from-unaligned-vaddr
+++ a/mm/hugetlb.c
@@ -5440,8 +5440,9 @@ long follow_hugetlb_page(struct mm_struc
continue;
}
- refs = min3(pages_per_huge_page(h) - pfn_offset,
- (vma->vm_end - vaddr) >> PAGE_SHIFT, remainder);
+ /* vaddr may not be aligned to PAGE_SIZE */
+ refs = min3(pages_per_huge_page(h) - pfn_offset, remainder,
+ (vma->vm_end - ALIGN_DOWN(vaddr, PAGE_SIZE)) >> PAGE_SHIFT);
if (pages || vmas)
record_subpages_vmas(mem_map_offset(page, pfn_offset),
_
Patches currently in -mm which might be from joao.m.martins(a)oracle.com are
mm-hugetlb-fix-refs-calculation-from-unaligned-vaddr.patch
Customers have been reporting that the I/O is radically being
slowed down to HPE virtual USB ILO served DVD images during installation.
Lots of investigation by the Red Hat lab has found that the issue is
because MSI edge interrupts do not work properly for these
ILO USB devices.
We start fast and then drop to polling mode and its unusable.
The issue exists currently upstream on 5.13 as tested by Red Hat,
and reverting the mentioned patch corrects this upstream.
David Jeffery has this explanation:
The problem with the patch turning on MSI appears to be that the ehci
driver (and possibly other usb controller types too) wasn't written to
support edge-triggered interrupts.
The ehci_irq routine appears to be written in such a way that it will
be racy with multiple interrupt source bits.
With a level-triggered interrupt, it gets called another time and cleans
up other interrupt sources.
But with MSI edge, the interrupt state staying high results in no
new interrupt and ehci has to run based on polling.
static irqreturn_t ehci_irq (struct usb_hcd *hcd)
{
...
status = ehci_readl(ehci, &ehci->regs->status);
/* e.g. cardbus physical eject */
if (status == ~(u32) 0) {
ehci_dbg (ehci, "device removed\n");
goto dead;
}
/*
* We don't use STS_FLR, but some controllers don't like it to
* remain on, so mask it out along with the other status bits.
*/
masked_status = status & (INTR_MASK | STS_FLR);
/* Shared IRQ? */
if (!masked_status || unlikely(ehci->rh_state == EHCI_RH_HALTED)) {
spin_unlock_irqrestore(&ehci->lock, flags);
return IRQ_NONE;
}
/* clear (just) interrupts */
ehci_writel(ehci, masked_status, &ehci->regs->status);
...
ehci_irq() reads the interrupt status register and then writes the active
interrupt-related bits back out to ack the interrupt cause.
But with an edge interrupt, this is racy as another source of interrupt
could be raised by ehci between the read and the write reaching the
hardware.
e.g. If STS_IAA was set during the initial read, but some other bit like
STS_INT gets raised by the hardware between the read and the write to the
interrupt status register, the interrupt signal state won't drop.
The interrupt state says high, and since it is now edged triggered with
MSI, no new invocation of the interrupt handler gets triggered.
Suggested-by: David Jeffery <djeffery(a)redhat.com>
Suggested-by: Alexandros Panagiotou <apanagio(a)redhat.com>
Tested-by: Laurence Oberman <loberman(a)redhat.com>
Signed-off-by: Laurence Oberman <loberman(a)redhat.com>
Fixes: 306c54d0edb6ba94d39877524dddebaad7770cf2 PCI: Disable MSI for
Pericom PCIe-USB adapter
---
drivers/usb/core/hcd-pci.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
index d630ccc..522a179 100644
--- a/drivers/usb/core/hcd-pci.c
+++ b/drivers/usb/core/hcd-pci.c
@@ -195,21 +195,20 @@ int usb_hcd_pci_probe(struct pci_dev *dev, const struct pci_device_id *id,
* make sure irq setup is not touched for xhci in generic hcd code
*/
if ((driver->flags & HCD_MASK) < HCD_USB3) {
- retval = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_LEGACY | PCI_IRQ_MSI);
- if (retval < 0) {
+ if (!dev->irq) {
dev_err(&dev->dev,
"Found HC with no IRQ. Check BIOS/PCI %s setup!\n",
pci_name(dev));
retval = -ENODEV;
goto disable_pci;
}
- hcd_irq = pci_irq_vector(dev, 0);
+ hcd_irq = dev->irq;
}
hcd = usb_create_hcd(driver, &dev->dev, pci_name(dev));
if (!hcd) {
retval = -ENOMEM;
- goto free_irq_vectors;
+ goto disable_pci;
}
hcd->amd_resume_bug = (usb_hcd_amd_remote_wakeup_quirk(dev) &&
@@ -288,9 +287,6 @@ int usb_hcd_pci_probe(struct pci_dev *dev, const struct pci_device_id *id,
put_hcd:
usb_put_hcd(hcd);
-free_irq_vectors:
- if ((driver->flags & HCD_MASK) < HCD_USB3)
- pci_free_irq_vectors(dev);
disable_pci:
pci_disable_device(dev);
dev_err(&dev->dev, "init %s fail, %d\n", pci_name(dev), retval);
@@ -352,8 +348,6 @@ void usb_hcd_pci_remove(struct pci_dev *dev)
up_read(&companions_rwsem);
}
usb_put_hcd(hcd);
- if ((hcd_driver_flags & HCD_MASK) < HCD_USB3)
- pci_free_irq_vectors(dev);
pci_disable_device(dev);
}
EXPORT_SYMBOL_GPL(usb_hcd_pci_remove);
@@ -465,7 +459,7 @@ static int suspend_common(struct device *dev, bool do_wakeup)
* synchronized here.
*/
if (!hcd->msix_enabled)
- synchronize_irq(pci_irq_vector(pci_dev, 0));
+ synchronize_irq(pci_dev->irq);
/* Downstream ports from this root hub should already be quiesced, so
* there will be no DMA activity. Now we can shut down the upstream
--
1.8.3.1
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
The conversion to ww mutexes failed to address the fence code which
already returns -EDEADLK when we run out of fences. Ww mutexes on
the other hand treat -EDEADLK as an internal errno value indicating
a need to restart the operation due to a deadlock. So now when the
fence code returns -EDEADLK the higher level code erroneously
restarts everything instead of returning the error to userspace
as is expected.
To remedy this let's switch the fence code to use a different errno
value for this. -ENOBUFS seems like a semi-reasonable unique choice.
Apart from igt the only user of this I could find is sna, and even
there all we do is dump the current fence registers from debugfs
into the X server log. So no user visible functionality is affected.
If we really cared about preserving this we could of course convert
back to -EDEADLK higher up, but doesn't seem like that's worth
the hassle here.
Not quite sure which commit specifically broke this, but I'll
just attribute it to the general gem ww mutex work.
Cc: stable(a)vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom(a)intel.com>
Testcase: igt/gem_pread/exhaustion
Testcase: igt/gem_pwrite/basic-exhaustion
Testcase: igt/gem_fenced_exec_thrash/too-many-fences
Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.")
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
index cac7f3f44642..f8948de72036 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
@@ -348,7 +348,7 @@ static struct i915_fence_reg *fence_find(struct i915_ggtt *ggtt)
if (intel_has_pending_fb_unpin(ggtt->vm.i915))
return ERR_PTR(-EAGAIN);
- return ERR_PTR(-EDEADLK);
+ return ERR_PTR(-ENOBUFS);
}
int __i915_vma_pin_fence(struct i915_vma *vma)
--
2.31.1
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 949241ad55a9 - Linux 5.13.2-rc1
The results of these automated tests are provided below.
Overall result: FAILED (see details below)
Merge: OK
Compile: OK
Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
One or more kernel tests failed:
ppc64le:
❌ LTP
aarch64:
❌ Boot test
We hope that these logs can help you find the problem quickly. For the full
detail on our testing procedures, please scroll to the bottom of this message.
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ Reboot test
✅ ACPI table test
✅ ACPI enabled test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ i2c: i2cdetect sanity
🚧 ✅ Firmware test suite
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
Host 2:
❌ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ stress: stress-ng
ppc64le:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ LTP
⚡⚡⚡ CIFS Connectathon
⚡⚡⚡ POSIX pjd-fstest suites
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ jvm - jcstress tests
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking cki netfilter test
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ xarray-idr-radixtree-test
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ audit: audit testsuite test
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Reboot test
❌ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
⚡⚡⚡ jvm - jcstress tests
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking cki netfilter test
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ xarray-idr-radixtree-test
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ audit: audit testsuite test
s390x:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Reboot test
✅ selinux-policy: serge-testsuite
✅ Storage: swraid mdadm raid_module test
🚧 ❌ Podman system integration test - as root
🚧 ❌ Podman system integration test - as user
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
🚧 ⚡⚡⚡ stress: stress-ng
Host 2:
✅ Boot test
✅ Reboot test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ trace: ftrace/tracer
🚧 ❌ xarray-idr-radixtree-test
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ xfstests - nfsv4.2
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ power-management: cpupower/sanity test
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Reboot test
✅ ACPI table test
⚡⚡⚡ LTP
⚡⚡⚡ CIFS Connectathon
⚡⚡⚡ POSIX pjd-fstest suites
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ jvm - jcstress tests
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking: igmp conformance test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking cki netfilter test
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: sanity smoke test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ storage: SCSI VPD
⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ xarray-idr-radixtree-test
🚧 ⚡⚡⚡ i2c: i2cdetect sanity
🚧 ⚡⚡⚡ Firmware test suite
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ audit: audit testsuite test
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ xfstests - nfsv4.2
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ power-management: cpupower/sanity test
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ xfstests - nfsv4.2
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ power-management: cpupower/sanity test
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Check the allocation size before toggling kfence_allocation_gate.
This way allocations that can't be served by KFENCE will not result in
waiting for another CONFIG_KFENCE_SAMPLE_INTERVAL without allocating
anything.
Suggested-by: Marco Elver <elver(a)google.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: stable(a)vger.kernel.org # 5.12+
Signed-off-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Marco Elver <elver(a)google.com>
---
mm/kfence/core.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 4d21ac44d5d35..33bb20d91bf6a 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -733,6 +733,13 @@ void kfence_shutdown_cache(struct kmem_cache *s)
void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
{
+ /*
+ * Perform size check before switching kfence_allocation_gate, so that
+ * we don't disable KFENCE without making an allocation.
+ */
+ if (size > PAGE_SIZE)
+ return NULL;
+
/*
* allocation_gate only needs to become non-zero, so it doesn't make
* sense to continue writing to it and pay the associated contention
@@ -757,9 +764,6 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
if (!READ_ONCE(kfence_enabled))
return NULL;
- if (size > PAGE_SIZE)
- return NULL;
-
return kfence_guarded_alloc(s, size, flags);
}
--
2.32.0.93.g670b81a890-goog
Affinity change can happen in both interrupt context and non-interrupt
context. The paranoia check for interrupt target cpu is not always true
in non-interrupt context.
For example:
Set affinity for irq to CPU#7
$ echo 80 > /proc/irq/default_smp_affinity
Open serial port on different CPU.
$taskset -c 4 agetty -L 115200 ttyS1
Will triger the following warning.
WARNING: CPU: 4 PID: 765 at arch/x86/kernel/apic/msi.c:75 msi_set_affinity+0x16f/0x190
Call Trace:
irq_do_set_affinity+0x57/0x1b0
irq_setup_affinity+0x136/0x190
irq_startup+0xaf/0xf0
__setup_irq+0x745/0x7e0
? kmem_cache_alloc_trace+0x2b5/0x450
request_threaded_irq+0x110/0x180
univ8250_setup_irq+0x1e2/0x220
serial8250_do_startup+0x481/0x760
serial8250_startup+0x1e/0x20
uart_startup.part.22+0xf8/0x260
uart_port_activate+0x41/0x60
tty_port_open+0x82/0xd0
? _raw_spin_unlock+0x16/0x30
uart_open+0x1e/0x30
tty_open+0x100/0x4d0
? cdev_purge+0x60/0x60
chrdev_open+0xa3/0x1c0
? cdev_put.part.3+0x20/0x20
do_dentry_open+0x21f/0x380
vfs_open+0x2f/0x40
path_openat+0xd67/0xf60
? page_add_file_rmap+0xad/0x140
? do_set_pte+0xd7/0x210
do_filp_open+0xc5/0x140
do_sys_openat2+0x239/0x300
? do_sys_openat2+0x239/0x300
do_sys_open+0x59/0x80
__x64_sys_openat+0x20/0x30
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
With this patch, the warning is only reported in interrupt context and when
the target CPU is not local CPU.
Signed-off-by: Yongxin Liu <yongxin.liu(a)windriver.com>
---
arch/x86/kernel/apic/msi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 44ebe25e7703..c4c73f227eeb 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -72,7 +72,7 @@ msi_set_affinity(struct irq_data *irqd, const struct cpumask *mask, bool force)
* Paranoia: Validate that the interrupt target is the local
* CPU.
*/
- if (WARN_ON_ONCE(cpu != smp_processor_id())) {
+ if (in_interrupt() && WARN_ON_ONCE(cpu != smp_processor_id())) {
irq_msi_update_msg(irqd, cfg);
return ret;
}
--
2.14.5
We skip filling out the pt with scratch entries if the va range covers
the entire pt, since we later have to fill it with the PTEs for the
object pages anyway. However this might leave open a small window where
the PTEs don't point to anything valid for the HW to consume.
When for example using 2M GTT pages this fill_px() showed up as being
quite significant in perf measurements, and ends up being completely
wasted since we ignore the pt and just use the pde directly.
Anyway, currently we have our PTE construction split between alloc and
insert, which is probably slightly iffy nowadays, since the alloc
doesn't actually allocate anything anymore, instead it just sets up the
page directories and points the PTEs at the scratch page. Later when we
do the insert step we re-program the PTEs again. Better might be to
squash the alloc and insert into a single step, then bringing back this
optimisation(along with some others) should be possible.
Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables")
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Jon Bloomfield <jon.bloomfield(a)intel.com>
Cc: Chris Wilson <chris.p.wilson(a)intel.com>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: <stable(a)vger.kernel.org> # v4.15+
---
drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 3d02c726c746..6e0e52eeb87a 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -303,10 +303,7 @@ static void __gen8_ppgtt_alloc(struct i915_address_space * const vm,
__i915_gem_object_pin_pages(pt->base);
i915_gem_object_make_unshrinkable(pt->base);
- if (lvl ||
- gen8_pt_count(*start, end) < I915_PDES ||
- intel_vgpu_active(vm->i915))
- fill_px(pt, vm->scratch[lvl]->encode);
+ fill_px(pt, vm->scratch[lvl]->encode);
spin_lock(&pd->lock);
if (likely(!pd->entry[idx])) {
--
2.26.3
Hi,
It looks like there are multiple use-after-free accesses in
j1939_session_deactivate()
static bool j1939_session_deactivate(struct j1939_session *session)
{
bool active;
j1939_session_list_lock(session->priv);
active = j1939_session_deactivate_locked(session); //session can be freed inside
j1939_session_list_unlock(session->priv); // It causes UAF read and write
return active;
}
session can be freed by
j1939_session_deactivate_locked->j1939_session_put->__j1939_session_release->j1939_session_destroy->kfree.
Therefore it makes the unlock function perform UAF access.
Best,
Xiaochen Zou
Department of Computer Science & Engineering
University of California, Riverside
The performance reporting driver added cpu hotplug
feature but it didn't add pmu migration call in cpu
offline function.
This can create an issue incase the current designated
cpu being used to collect fme pmu data got offline,
as based on current code we are not migrating fme pmu to
new target cpu. Because of that perf will still try to
fetch data from that offline cpu and hence we will not
get counter data.
Patch fixed this issue by adding pmu_migrate_context call
in fme_perf_offline_cpu function.
Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
Tested-by: Xu Yilun <yilun.xu(a)intel.com>
Signed-off-by: Kajol Jain <kjain(a)linux.ibm.com>
Cc: stable(a)vger.kernel.org
---
drivers/fpga/dfl-fme-perf.c | 4 ++++
1 file changed, 4 insertions(+)
---
Changelog:
v1 -> v2:
- Add stable(a)vger.kernel.org in cc list
RFC -> PATCH v1
- Remove RFC tag
- Did nits changes on subject and commit message as suggested by Xu Yilun
- Added Tested-by tag
- Link to rfc patch: https://lkml.org/lkml/2021/6/28/112
---
diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
index 4299145ef347..b9a54583e505 100644
--- a/drivers/fpga/dfl-fme-perf.c
+++ b/drivers/fpga/dfl-fme-perf.c
@@ -953,6 +953,10 @@ static int fme_perf_offline_cpu(unsigned int cpu, struct hlist_node *node)
return 0;
priv->cpu = target;
+
+ /* Migrate fme_perf pmu events to the new target cpu */
+ perf_pmu_migrate_context(&priv->pmu, cpu, target);
+
return 0;
}
--
2.31.1
From: Tian Tao <tiantao6(a)hisilicon.com>
[ Upstream commit 7d614ab2f20503ed8766363d41f8607337571adf ]
fixed the below warning:
drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c:84:2-8: WARNING: NULL check
before some freeing functions is not needed.
Signed-off-by: Tian Tao <tiantao6(a)hisilicon.com>
Acked-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Lucas Stach <l.stach(a)pengutronix.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
index 4aa3426a9ba4..059ec31d532d 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
@@ -77,8 +77,7 @@ static void etnaviv_gem_prime_release(struct etnaviv_gem_object *etnaviv_obj)
/* Don't drop the pages for imported dmabuf, as they are not
* ours, just free the array we allocated:
*/
- if (etnaviv_obj->pages)
- kvfree(etnaviv_obj->pages);
+ kvfree(etnaviv_obj->pages);
drm_prime_gem_destroy(&etnaviv_obj->base, etnaviv_obj->sgt);
}
--
2.30.2
From: Nikolay Aleksandrov <nikolay(a)nvidia.com>
Hi,
While working on per-vlan multicast snooping I found two race conditions
when multicast snooping is enabled. They're identical and happen when
the router port list is modified without the multicast lock. One requires
a PIM hello message to be received on a port and the other an MRD
advertisement. To fix them we just need to take the multicast_lock when
adding the ports to the router port list (marking them as router ports).
Tested on an affected setup by generating the required packets while
modifying the port list in parallel.
Thanks,
Nik
Nikolay Aleksandrov (2):
net: bridge: multicast: fix PIM hello router port marking race
net: bridge: multicast: fix MRD advertisement router port marking race
net/bridge/br_multicast.c | 6 ++++++
1 file changed, 6 insertions(+)
--
2.31.1
Richard reported sporadic (roughly one in 10 or so) null dereferences and
other strange behaviour for a set of automated LTP tests. Things like:
BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 1516 Comm: umount Not tainted 5.10.0-yocto-standard #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
RIP: 0010:kernfs_sop_show_path+0x1b/0x60
...or these others:
RIP: 0010:do_mkdirat+0x6a/0xf0
RIP: 0010:d_alloc_parallel+0x98/0x510
RIP: 0010:do_readlinkat+0x86/0x120
There were other less common instances of some kind of a general scribble
but the common theme was mount and cgroup and a dubious dentry triggering
the NULL dereference. I was only able to reproduce it under qemu by
replicating Richard's setup as closely as possible - I never did get it
to happen on bare metal, even while keeping everything else the same.
In commit 71d883c37e8d ("cgroup_do_mount(): massage calling conventions")
we see this as a part of the overall change:
--------------
struct cgroup_subsys *ss;
- struct dentry *dentry;
[...]
- dentry = cgroup_do_mount(&cgroup_fs_type, fc->sb_flags, root,
- CGROUP_SUPER_MAGIC, ns);
[...]
- if (percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
- struct super_block *sb = dentry->d_sb;
- dput(dentry);
+ ret = cgroup_do_mount(fc, CGROUP_SUPER_MAGIC, ns);
+ if (!ret && percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
+ struct super_block *sb = fc->root->d_sb;
+ dput(fc->root);
deactivate_locked_super(sb);
msleep(10);
return restart_syscall();
}
--------------
In changing from the local "*dentry" variable to using fc->root, we now
export/leave that dentry pointer in the file context after doing the dput()
in the unlikely "is_dying" case. With LTP doing a crazy amount of back to
back mount/unmount [testcases/bin/cgroup_regression_5_1.sh] the unlikely
becomes slightly likely and then bad things happen.
A fix would be to not leave the stale reference in fc->root as follows:
--------------
dput(fc->root);
+ fc->root = NULL;
deactivate_locked_super(sb);
--------------
...but then we are just open-coding a duplicate of fc_drop_locked() so we
simply use that instead.
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Zefan Li <lizefan.x(a)bytedance.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: stable(a)vger.kernel.org # v5.1+
Reported-by: Richard Purdie <richard.purdie(a)linuxfoundation.org>
Fixes: 71d883c37e8d ("cgroup_do_mount(): massage calling conventions")
Signed-off-by: Paul Gortmaker <paul.gortmaker(a)windriver.com>
diff --git a/fs/internal.h b/fs/internal.h
index 6aeae7ef3380..728f8d70d7f1 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -61,7 +61,6 @@ extern void __init chrdev_init(void);
*/
extern const struct fs_context_operations legacy_fs_context_ops;
extern int parse_monolithic_mount_data(struct fs_context *, void *);
-extern void fc_drop_locked(struct fs_context *);
extern void vfs_clean_context(struct fs_context *fc);
extern int finish_clean_context(struct fs_context *fc);
diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h
index 37e1e8f7f08d..5b44b0195a28 100644
--- a/include/linux/fs_context.h
+++ b/include/linux/fs_context.h
@@ -139,6 +139,7 @@ extern int vfs_parse_fs_string(struct fs_context *fc, const char *key,
extern int generic_parse_monolithic(struct fs_context *fc, void *data);
extern int vfs_get_tree(struct fs_context *fc);
extern void put_fs_context(struct fs_context *fc);
+extern void fc_drop_locked(struct fs_context *fc);
/*
* sget() wrappers to be called from the ->get_tree() op.
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 1f274d7fc934..6e554743cf2b 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -1223,9 +1223,7 @@ int cgroup1_get_tree(struct fs_context *fc)
ret = cgroup_do_get_tree(fc);
if (!ret && percpu_ref_is_dying(&ctx->root->cgrp.self.refcnt)) {
- struct super_block *sb = fc->root->d_sb;
- dput(fc->root);
- deactivate_locked_super(sb);
+ fc_drop_locked(fc);
ret = 1;
}
--
2.31.1
This is the start of the stable review cycle for the 4.9.275 release.
There are 9 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.275-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.275-rc1
Juergen Gross <jgross(a)suse.com>
xen/events: reset active flag for lateeoi events later
Petr Mladek <pmladek(a)suse.com>
kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
Petr Mladek <pmladek(a)suse.com>
kthread_worker: split code for canceling the delayed work timer
Christian König <christian.koenig(a)amd.com>
drm/nouveau: fix dma_address check for CPU/GPU sync
ManYi Li <limanyi(a)uniontech.com>
scsi: sr: Return appropriate error code when disk is ejected
Hugh Dickins <hughd(a)google.com>
mm, futex: fix shared futex pgoff on shmem huge page
Yang Shi <shy828301(a)gmail.com>
mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split
Alex Shi <alex.shi(a)linux.alibaba.com>
mm: add VM_WARN_ON_ONCE_PAGE() macro
Michal Hocko <mhocko(a)kernel.org>
include/linux/mmdebug.h: make VM_WARN* non-rvals
-------------
Diffstat:
Makefile | 4 +-
drivers/gpu/drm/nouveau/nouveau_bo.c | 4 +-
drivers/scsi/sr.c | 2 +
drivers/xen/events/events_base.c | 23 +++++++++--
include/linux/hugetlb.h | 15 -------
include/linux/mmdebug.h | 21 ++++++++--
include/linux/pagemap.h | 13 +++---
kernel/futex.c | 2 +-
kernel/kthread.c | 77 ++++++++++++++++++++++++------------
mm/huge_memory.c | 29 +++++---------
mm/hugetlb.c | 5 +--
11 files changed, 112 insertions(+), 83 deletions(-)
Hi
Am 09.07.21 um 20:50 schrieb Sam Ravnborg:
> Hi Thomas,
>
> On Mon, Jul 05, 2021 at 02:45:04PM +0200, Thomas Zimmermann wrote:
>> Put the clock-selection code into each of the PLL-update functions to
>> make them select the correct pixel clock.
>>
>> The pixel clock for video output was not actually set before programming
>> the clock's values. It worked because the device had the correct clock
>> pre-set.
>>
>> Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
>> Fixes: db05f8d3dc87 ("drm/mgag200: Split MISC register update into PLL selection, SYNC and I/O")
>> Cc: Sam Ravnborg <sam(a)ravnborg.org>
>> Cc: Emil Velikov <emil.velikov(a)collabora.com>
>> Cc: Dave Airlie <airlied(a)redhat.com>
>> Cc: dri-devel(a)lists.freedesktop.org
>> Cc: <stable(a)vger.kernel.org> # v5.9+
>> ---
>> drivers/gpu/drm/mgag200/mgag200_mode.c | 47 ++++++++++++++++++++------
>> 1 file changed, 37 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c
>> index 3b3059f471c2..482843ebb69f 100644
>> --- a/drivers/gpu/drm/mgag200/mgag200_mode.c
>> +++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
>> @@ -130,6 +130,7 @@ static int mgag200_g200_set_plls(struct mga_device *mdev, long clock)
>> long ref_clk = mdev->model.g200.ref_clk;
>> long p_clk_min = mdev->model.g200.pclk_min;
>> long p_clk_max = mdev->model.g200.pclk_max;
>> + u8 misc;
>>
>> if (clock > p_clk_max) {
>> drm_err(dev, "Pixel Clock %ld too high\n", clock);
>> @@ -174,6 +175,11 @@ static int mgag200_g200_set_plls(struct mga_device *mdev, long clock)
>> drm_dbg_kms(dev, "clock: %ld vco: %ld m: %d n: %d p: %d s: %d\n",
>> clock, f_vco, m, n, p, s);
>>
>> + misc = RREG8(MGA_MISC_IN);
>> + misc &= ~MGAREG_MISC_CLK_SEL_MASK;
>> + misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
>> + WREG8(MGA_MISC_OUT, misc);
>
> This chunk is repeated a number of times.
> Any good reason why this is not a small helper?
Good point. I'll make a helper from this.
Best regards
Thomas
>
> Sam
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1421ec684a43379b2aa3cfda20b03d38282dc990 Mon Sep 17 00:00:00 2001
From: Xiaochen Shen <xiaochen.shen(a)intel.com>
Date: Thu, 27 May 2021 17:31:53 +0800
Subject: [PATCH] selftests/resctrl: Fix incorrect parsing of option "-t"
Resctrl test suite accepts command line argument "-t" to specify the
unit tests to run in the test list (e.g., -t mbm,mba,cmt,cat) as
documented in the help.
When calling strtok() to parse the option, the incorrect delimiters
argument ":\t" is used. As a result, passing "-t mbm,mba,cmt,cat" throws
an invalid option error.
Fix this by using delimiters argument "," instead of ":\t" for parsing
of unit tests list. At the same time, remove the unnecessary "spaces"
between the unit tests in help documentation to prevent confusion.
Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
Fixes: 78941183d1b1 ("selftests/resctrl: Add Cache QoS Monitoring (CQM) selftest")
Fixes: ecdbb911f22d ("selftests/resctrl: Add MBM test")
Fixes: 034c7678dd2c ("selftests/resctrl: Add README for resctrl tests")
Cc: stable(a)vger.kernel.org
Signed-off-by: Xiaochen Shen <xiaochen.shen(a)intel.com>
Reviewed-by: Tony Luck <tony.luck(a)intel.com>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
diff --git a/tools/testing/selftests/resctrl/README b/tools/testing/selftests/resctrl/README
index 4b36b25b6ac0..3d2bbd4fa3aa 100644
--- a/tools/testing/selftests/resctrl/README
+++ b/tools/testing/selftests/resctrl/README
@@ -47,7 +47,7 @@ Parameter '-h' shows usage information.
usage: resctrl_tests [-h] [-b "benchmark_cmd [options]"] [-t test list] [-n no_of_bits]
-b benchmark_cmd [options]: run specified benchmark for MBM, MBA and CMT default benchmark is builtin fill_buf
- -t test list: run tests specified in the test list, e.g. -t mbm, mba, cmt, cat
+ -t test list: run tests specified in the test list, e.g. -t mbm,mba,cmt,cat
-n no_of_bits: run cache tests using specified no of bits in cache bit mask
-p cpu_no: specify CPU number to run the test. 1 is default
-h: help
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index f51b5fc066a3..973f09a66e1e 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -40,7 +40,7 @@ static void cmd_help(void)
printf("\t-b benchmark_cmd [options]: run specified benchmark for MBM, MBA and CMT\n");
printf("\t default benchmark is builtin fill_buf\n");
printf("\t-t test list: run tests specified in the test list, ");
- printf("e.g. -t mbm, mba, cmt, cat\n");
+ printf("e.g. -t mbm,mba,cmt,cat\n");
printf("\t-n no_of_bits: run cache tests using specified no of bits in cache bit mask\n");
printf("\t-p cpu_no: specify CPU number to run the test. 1 is default\n");
printf("\t-h: help\n");
@@ -173,7 +173,7 @@ int main(int argc, char **argv)
return -1;
}
- token = strtok(NULL, ":\t");
+ token = strtok(NULL, ",");
}
break;
case 'p':
I'm announcing the release of the 5.4.131 kernel.
All users of the 5.4 kernel series must upgrade.
The updated 5.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/s390/include/asm/stacktrace.h | 18 +++++++++++-------
arch/x86/kvm/svm.c | 33 ++++++++++++++++++++++-----------
drivers/xen/events/events_base.c | 23 +++++++++++++++++++----
4 files changed, 53 insertions(+), 23 deletions(-)
Alper Gun (1):
KVM: SVM: Call SEV Guest Decommission if ASID binding fails
David Rientjes (1):
KVM: SVM: Periodically schedule when unregistering regions on destroy
Greg Kroah-Hartman (1):
Linux 5.4.131
Heiko Carstens (1):
s390/stack: fix possible register corruption with stack switch helper
Juergen Gross (1):
xen/events: reset active flag for lateeoi events later
This is the start of the stable review cycle for the 4.19.197 release.
There are 34 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.197-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.197-rc1
Tony Lindgren <tony(a)atomide.com>
clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940
Tony Lindgren <tony(a)atomide.com>
clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue
Tony Lindgren <tony(a)atomide.com>
clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support
afzal mohammed <afzal.mohd.ma(a)gmail.com>
ARM: OMAP: replace setup_irq() by request_irq()
Alper Gun <alpergun(a)google.com>
KVM: SVM: Call SEV Guest Decommission if ASID binding fails
Juergen Gross <jgross(a)suse.com>
xen/events: reset active flag for lateeoi events later
Petr Mladek <pmladek(a)suse.com>
kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
Petr Mladek <pmladek(a)suse.com>
kthread_worker: split code for canceling the delayed work timer
Anson Huang <Anson.Huang(a)nxp.com>
ARM: dts: imx6qdl-sabresd: Remove incorrect power supply assignment
David Rientjes <rientjes(a)google.com>
KVM: SVM: Periodically schedule when unregistering regions on destroy
Tahsin Erdogan <trdgn(a)amazon.com>
ext4: eliminate bogus error in ext4_data_block_valid_rcu()
Christian König <christian.koenig(a)amd.com>
drm/nouveau: fix dma_address check for CPU/GPU sync
ManYi Li <limanyi(a)uniontech.com>
scsi: sr: Return appropriate error code when disk is ejected
Hugh Dickins <hughd(a)google.com>
mm, futex: fix shared futex pgoff on shmem huge page
Hugh Dickins <hughd(a)google.com>
mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()
Hugh Dickins <hughd(a)google.com>
mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): get vma_address_end() earlier
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): use goto instead of while (1)
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): add a level of indentation
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): crossing page table boundary
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): settle PageHuge on entry
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): use page for pvmw->page
Yang Shi <shy828301(a)gmail.com>
mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split
Hugh Dickins <hughd(a)google.com>
mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()
Jue Wang <juew(a)google.com>
mm/thp: fix page_address_in_vma() on file THP tails
Hugh Dickins <hughd(a)google.com>
mm/thp: fix vma_address() if virtual address below file offset
Hugh Dickins <hughd(a)google.com>
mm/thp: try_to_unmap() use TTU_SYNC for safe splitting
Hugh Dickins <hughd(a)google.com>
mm/thp: make is_huge_zero_pmd() safe and quicker
Hugh Dickins <hughd(a)google.com>
mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
Miaohe Lin <linmiaohe(a)huawei.com>
mm/rmap: use page_not_mapped in try_to_unmap()
Miaohe Lin <linmiaohe(a)huawei.com>
mm/rmap: remove unneeded semicolon in page_not_mapped()
Alex Shi <alex.shi(a)linux.alibaba.com>
mm: add VM_WARN_ON_ONCE_PAGE() macro
-------------
Diffstat:
Makefile | 4 +-
arch/arm/boot/dts/dra7.dtsi | 11 ++
arch/arm/boot/dts/imx6qdl-sabresd.dtsi | 4 -
arch/arm/mach-omap1/pm.c | 13 ++-
arch/arm/mach-omap1/time.c | 10 +-
arch/arm/mach-omap1/timer32k.c | 10 +-
arch/arm/mach-omap2/board-generic.c | 4 +-
arch/arm/mach-omap2/timer.c | 181 +++++++++++++++++++++++----------
arch/x86/kvm/svm.c | 33 ++++--
drivers/clk/ti/clk-7xx.c | 1 +
drivers/gpu/drm/nouveau/nouveau_bo.c | 4 +-
drivers/scsi/sr.c | 2 +
drivers/xen/events/events_base.c | 23 ++++-
fs/ext4/block_validity.c | 4 +-
include/linux/cpuhotplug.h | 1 +
include/linux/huge_mm.h | 8 +-
include/linux/hugetlb.h | 16 ---
include/linux/mm.h | 3 +
include/linux/mmdebug.h | 13 +++
include/linux/pagemap.h | 13 +--
include/linux/rmap.h | 3 +-
kernel/futex.c | 2 +-
kernel/kthread.c | 77 +++++++++-----
mm/huge_memory.c | 56 +++++-----
mm/hugetlb.c | 5 +-
mm/internal.h | 53 +++++++---
mm/memory.c | 41 ++++++++
mm/page_vma_mapped.c | 160 ++++++++++++++++++-----------
mm/pgtable-generic.c | 4 +-
mm/rmap.c | 48 +++++----
mm/truncate.c | 43 ++++----
31 files changed, 546 insertions(+), 304 deletions(-)
This is the start of the stable review cycle for the 4.14.239 release.
There are 25 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.239-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.239-rc1
Juergen Gross <jgross(a)suse.com>
xen/events: reset active flag for lateeoi events later
Petr Mladek <pmladek(a)suse.com>
kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
Petr Mladek <pmladek(a)suse.com>
kthread_worker: split code for canceling the delayed work timer
Sean Young <sean(a)mess.org>
kfifo: DECLARE_KIFO_PTR(fifo, u64) does not work on arm 32 bit
Christian König <christian.koenig(a)amd.com>
drm/nouveau: fix dma_address check for CPU/GPU sync
ManYi Li <limanyi(a)uniontech.com>
scsi: sr: Return appropriate error code when disk is ejected
Hugh Dickins <hughd(a)google.com>
mm, futex: fix shared futex pgoff on shmem huge page
Hugh Dickins <hughd(a)google.com>
mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()
Hugh Dickins <hughd(a)google.com>
mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): get vma_address_end() earlier
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): use goto instead of while (1)
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): add a level of indentation
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): crossing page table boundary
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): settle PageHuge on entry
Hugh Dickins <hughd(a)google.com>
mm: page_vma_mapped_walk(): use page for pvmw->page
Yang Shi <shy828301(a)gmail.com>
mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split
Jue Wang <juew(a)google.com>
mm/thp: fix page_address_in_vma() on file THP tails
Hugh Dickins <hughd(a)google.com>
mm/thp: fix vma_address() if virtual address below file offset
Hugh Dickins <hughd(a)google.com>
mm/thp: try_to_unmap() use TTU_SYNC for safe splitting
Miaohe Lin <linmiaohe(a)huawei.com>
mm/rmap: use page_not_mapped in try_to_unmap()
Miaohe Lin <linmiaohe(a)huawei.com>
mm/rmap: remove unneeded semicolon in page_not_mapped()
Alex Shi <alex.shi(a)linux.alibaba.com>
mm: add VM_WARN_ON_ONCE_PAGE() macro
Michal Hocko <mhocko(a)kernel.org>
include/linux/mmdebug.h: make VM_WARN* non-rvals
-------------
Diffstat:
Makefile | 4 +-
drivers/gpu/drm/nouveau/nouveau_bo.c | 4 +-
drivers/scsi/sr.c | 2 +
drivers/xen/events/events_base.c | 23 ++++-
include/linux/hugetlb.h | 16 ----
include/linux/kfifo.h | 3 +-
include/linux/mmdebug.h | 21 ++++-
include/linux/pagemap.h | 13 +--
include/linux/rmap.h | 3 +-
kernel/futex.c | 2 +-
kernel/kthread.c | 77 +++++++++++------
mm/huge_memory.c | 26 ++----
mm/hugetlb.c | 5 +-
mm/internal.h | 53 +++++++++---
mm/page_vma_mapped.c | 160 ++++++++++++++++++++++-------------
mm/rmap.c | 48 ++++++-----
16 files changed, 281 insertions(+), 179 deletions(-)
This is the start of the stable review cycle for the 5.12.16 release.
There are 11 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.12.16-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.12.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.12.16-rc1
Lorenzo Bianconi <lorenzo(a)kernel.org>
mt76: mt7921: get rid of mcu_reset function pointer
Sean Wang <sean.wang(a)mediatek.com>
mt76: mt7921: abort uncompleted scan by wifi reset
Lorenzo Bianconi <lorenzo(a)kernel.org>
mt76: mt7921: add wifi reset support
Lorenzo Bianconi <lorenzo(a)kernel.org>
mt76: dma: export mt76_dma_rx_cleanup routine
Lorenzo Bianconi <lorenzo(a)kernel.org>
mt76: dma: introduce mt76_dma_queue_reset routine
Lorenzo Bianconi <lorenzo(a)kernel.org>
mt76: mt7921: introduce __mt7921_start utility routine
Lorenzo Bianconi <lorenzo(a)kernel.org>
mt76: mt7921: introduce mt7921_run_firmware utility routine.
Lorenzo Bianconi <lorenzo(a)kernel.org>
mt76: mt7921: check mcu returned values in mt7921_start
Sid Manning <sidneym(a)codeaurora.org>
Hexagon: change jumps to must-extend in futex_atomic_*
Sid Manning <sidneym(a)codeaurora.org>
Hexagon: add target builtins to kernel
Sid Manning <sidneym(a)codeaurora.org>
Hexagon: fix build errors
-------------
Diffstat:
Makefile | 4 +-
arch/hexagon/Makefile | 6 +-
arch/hexagon/include/asm/futex.h | 4 +-
arch/hexagon/include/asm/timex.h | 3 +-
arch/hexagon/kernel/hexagon_ksyms.c | 8 +-
arch/hexagon/kernel/ptrace.c | 4 +-
arch/hexagon/lib/Makefile | 3 +-
arch/hexagon/lib/divsi3.S | 67 +++++++
arch/hexagon/lib/memcpy_likely_aligned.S | 56 ++++++
arch/hexagon/lib/modsi3.S | 46 +++++
arch/hexagon/lib/udivsi3.S | 38 ++++
arch/hexagon/lib/umodsi3.S | 36 ++++
drivers/net/wireless/mediatek/mt76/dma.c | 47 +++--
drivers/net/wireless/mediatek/mt76/mt76.h | 8 +-
drivers/net/wireless/mediatek/mt76/mt7921/init.c | 3 +-
drivers/net/wireless/mediatek/mt76/mt7921/mac.c | 209 +++++++++++++++------
drivers/net/wireless/mediatek/mt76/mt7921/main.c | 38 ++--
drivers/net/wireless/mediatek/mt76/mt7921/mcu.c | 36 ++--
drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h | 6 +-
drivers/net/wireless/mediatek/mt76/mt7921/regs.h | 4 +
20 files changed, 508 insertions(+), 118 deletions(-)
This is the start of the stable review cycle for the 5.10.49 release.
There are 6 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 11 Jul 2021 13:14:09 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.49-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.49-rc1
Juergen Gross <jgross(a)suse.com>
xen/events: reset active flag for lateeoi events later
Sid Manning <sidneym(a)codeaurora.org>
Hexagon: change jumps to must-extend in futex_atomic_*
Sid Manning <sidneym(a)codeaurora.org>
Hexagon: add target builtins to kernel
Sid Manning <sidneym(a)codeaurora.org>
Hexagon: fix build errors
Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
media: uvcvideo: Support devices that report an OT as an entity source
Fabiano Rosas <farosas(a)linux.ibm.com>
KVM: PPC: Book3S HV: Save and restore FSCR in the P9 path
-------------
Diffstat:
Makefile | 4 +-
arch/hexagon/Makefile | 6 +--
arch/hexagon/include/asm/futex.h | 4 +-
arch/hexagon/include/asm/timex.h | 3 +-
arch/hexagon/kernel/hexagon_ksyms.c | 8 ++--
arch/hexagon/kernel/ptrace.c | 4 +-
arch/hexagon/lib/Makefile | 3 +-
arch/hexagon/lib/divsi3.S | 67 ++++++++++++++++++++++++++++++++
arch/hexagon/lib/memcpy_likely_aligned.S | 56 ++++++++++++++++++++++++++
arch/hexagon/lib/modsi3.S | 46 ++++++++++++++++++++++
arch/hexagon/lib/udivsi3.S | 38 ++++++++++++++++++
arch/hexagon/lib/umodsi3.S | 36 +++++++++++++++++
arch/powerpc/kvm/book3s_hv.c | 4 ++
drivers/media/usb/uvc/uvc_driver.c | 32 +++++++++++++++
drivers/xen/events/events_base.c | 23 +++++++++--
15 files changed, 315 insertions(+), 19 deletions(-)
The patch below does not apply to the 5.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 997135017716c33f3405e86cca5da9567b40a08e Mon Sep 17 00:00:00 2001
From: Olivier Langlois <olivier(a)trillion01.com>
Date: Wed, 23 Jun 2021 11:50:11 -0700
Subject: [PATCH] io_uring: Fix race condition when sqp thread goes to sleep
If an asynchronous completion happens before the task is preparing
itself to wait and set its state to TASK_INTERRUPTIBLE, the completion
will not wake up the sqp thread.
Cc: stable(a)vger.kernel.org
Signed-off-by: Olivier Langlois <olivier(a)trillion01.com>
Reviewed-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/d1419dc32ec6a97b453bee34dc03fa6a02797142.16244732…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index fc8637f591a6..7c545fa66f31 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6902,7 +6902,7 @@ static int io_sq_thread(void *data)
}
prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
- if (!io_sqd_events_pending(sqd)) {
+ if (!io_sqd_events_pending(sqd) && !io_run_task_work()) {
needs_sched = true;
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
io_ring_set_wakeup_flag(ctx);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 997135017716c33f3405e86cca5da9567b40a08e Mon Sep 17 00:00:00 2001
From: Olivier Langlois <olivier(a)trillion01.com>
Date: Wed, 23 Jun 2021 11:50:11 -0700
Subject: [PATCH] io_uring: Fix race condition when sqp thread goes to sleep
If an asynchronous completion happens before the task is preparing
itself to wait and set its state to TASK_INTERRUPTIBLE, the completion
will not wake up the sqp thread.
Cc: stable(a)vger.kernel.org
Signed-off-by: Olivier Langlois <olivier(a)trillion01.com>
Reviewed-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/d1419dc32ec6a97b453bee34dc03fa6a02797142.16244732…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index fc8637f591a6..7c545fa66f31 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6902,7 +6902,7 @@ static int io_sq_thread(void *data)
}
prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
- if (!io_sqd_events_pending(sqd)) {
+ if (!io_sqd_events_pending(sqd) && !io_run_task_work()) {
needs_sched = true;
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
io_ring_set_wakeup_flag(ctx);
The patch below does not apply to the 5.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 997135017716c33f3405e86cca5da9567b40a08e Mon Sep 17 00:00:00 2001
From: Olivier Langlois <olivier(a)trillion01.com>
Date: Wed, 23 Jun 2021 11:50:11 -0700
Subject: [PATCH] io_uring: Fix race condition when sqp thread goes to sleep
If an asynchronous completion happens before the task is preparing
itself to wait and set its state to TASK_INTERRUPTIBLE, the completion
will not wake up the sqp thread.
Cc: stable(a)vger.kernel.org
Signed-off-by: Olivier Langlois <olivier(a)trillion01.com>
Reviewed-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/d1419dc32ec6a97b453bee34dc03fa6a02797142.16244732…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index fc8637f591a6..7c545fa66f31 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6902,7 +6902,7 @@ static int io_sq_thread(void *data)
}
prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
- if (!io_sqd_events_pending(sqd)) {
+ if (!io_sqd_events_pending(sqd) && !io_run_task_work()) {
needs_sched = true;
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
io_ring_set_wakeup_flag(ctx);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 77f30bfcfcf484da7208affd6a9e63406420bf91 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Thu, 27 May 2021 16:52:36 -0700
Subject: [PATCH] fscrypt: don't ignore minor_hash when hash is 0
When initializing a no-key name, fscrypt_fname_disk_to_usr() sets the
minor_hash to 0 if the (major) hash is 0.
This doesn't make sense because 0 is a valid hash code, so we shouldn't
ignore the filesystem-provided minor_hash in that case. Fix this by
removing the special case for 'hash == 0'.
This is an old bug that appears to have originated when the encryption
code in ext4 and f2fs was moved into fs/crypto/. The original ext4 and
f2fs code passed the hash by pointer instead of by value. So
'if (hash)' actually made sense then, as it was checking whether a
pointer was NULL. But now the hashes are passed by value, and
filesystems just pass 0 for any hashes they don't have. There is no
need to handle this any differently from the hashes actually being 0.
It is difficult to reproduce this bug, as it only made a difference in
the case where a filename's 32-bit major hash happened to be 0.
However, it probably had the largest chance of causing problems on
ubifs, since ubifs uses minor_hash to do lookups of no-key names, in
addition to using it as a readdir cookie. ext4 only uses minor_hash as
a readdir cookie, and f2fs doesn't use minor_hash at all.
Fixes: 0b81d0779072 ("fs crypto: move per-file encryption from f2fs tree to fs/crypto")
Cc: <stable(a)vger.kernel.org> # v4.6+
Link: https://lore.kernel.org/r/20210527235236.2376556-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
index 6ca7d16593ff..d00455440d08 100644
--- a/fs/crypto/fname.c
+++ b/fs/crypto/fname.c
@@ -344,13 +344,9 @@ int fscrypt_fname_disk_to_usr(const struct inode *inode,
offsetof(struct fscrypt_nokey_name, sha256));
BUILD_BUG_ON(BASE64_CHARS(FSCRYPT_NOKEY_NAME_MAX) > NAME_MAX);
- if (hash) {
- nokey_name.dirhash[0] = hash;
- nokey_name.dirhash[1] = minor_hash;
- } else {
- nokey_name.dirhash[0] = 0;
- nokey_name.dirhash[1] = 0;
- }
+ nokey_name.dirhash[0] = hash;
+ nokey_name.dirhash[1] = minor_hash;
+
if (iname->len <= sizeof(nokey_name.bytes)) {
memcpy(nokey_name.bytes, iname->name, iname->len);
size = offsetof(struct fscrypt_nokey_name, bytes[iname->len]);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 77f30bfcfcf484da7208affd6a9e63406420bf91 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Thu, 27 May 2021 16:52:36 -0700
Subject: [PATCH] fscrypt: don't ignore minor_hash when hash is 0
When initializing a no-key name, fscrypt_fname_disk_to_usr() sets the
minor_hash to 0 if the (major) hash is 0.
This doesn't make sense because 0 is a valid hash code, so we shouldn't
ignore the filesystem-provided minor_hash in that case. Fix this by
removing the special case for 'hash == 0'.
This is an old bug that appears to have originated when the encryption
code in ext4 and f2fs was moved into fs/crypto/. The original ext4 and
f2fs code passed the hash by pointer instead of by value. So
'if (hash)' actually made sense then, as it was checking whether a
pointer was NULL. But now the hashes are passed by value, and
filesystems just pass 0 for any hashes they don't have. There is no
need to handle this any differently from the hashes actually being 0.
It is difficult to reproduce this bug, as it only made a difference in
the case where a filename's 32-bit major hash happened to be 0.
However, it probably had the largest chance of causing problems on
ubifs, since ubifs uses minor_hash to do lookups of no-key names, in
addition to using it as a readdir cookie. ext4 only uses minor_hash as
a readdir cookie, and f2fs doesn't use minor_hash at all.
Fixes: 0b81d0779072 ("fs crypto: move per-file encryption from f2fs tree to fs/crypto")
Cc: <stable(a)vger.kernel.org> # v4.6+
Link: https://lore.kernel.org/r/20210527235236.2376556-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
index 6ca7d16593ff..d00455440d08 100644
--- a/fs/crypto/fname.c
+++ b/fs/crypto/fname.c
@@ -344,13 +344,9 @@ int fscrypt_fname_disk_to_usr(const struct inode *inode,
offsetof(struct fscrypt_nokey_name, sha256));
BUILD_BUG_ON(BASE64_CHARS(FSCRYPT_NOKEY_NAME_MAX) > NAME_MAX);
- if (hash) {
- nokey_name.dirhash[0] = hash;
- nokey_name.dirhash[1] = minor_hash;
- } else {
- nokey_name.dirhash[0] = 0;
- nokey_name.dirhash[1] = 0;
- }
+ nokey_name.dirhash[0] = hash;
+ nokey_name.dirhash[1] = minor_hash;
+
if (iname->len <= sizeof(nokey_name.bytes)) {
memcpy(nokey_name.bytes, iname->name, iname->len);
size = offsetof(struct fscrypt_nokey_name, bytes[iname->len]);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 77f30bfcfcf484da7208affd6a9e63406420bf91 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Thu, 27 May 2021 16:52:36 -0700
Subject: [PATCH] fscrypt: don't ignore minor_hash when hash is 0
When initializing a no-key name, fscrypt_fname_disk_to_usr() sets the
minor_hash to 0 if the (major) hash is 0.
This doesn't make sense because 0 is a valid hash code, so we shouldn't
ignore the filesystem-provided minor_hash in that case. Fix this by
removing the special case for 'hash == 0'.
This is an old bug that appears to have originated when the encryption
code in ext4 and f2fs was moved into fs/crypto/. The original ext4 and
f2fs code passed the hash by pointer instead of by value. So
'if (hash)' actually made sense then, as it was checking whether a
pointer was NULL. But now the hashes are passed by value, and
filesystems just pass 0 for any hashes they don't have. There is no
need to handle this any differently from the hashes actually being 0.
It is difficult to reproduce this bug, as it only made a difference in
the case where a filename's 32-bit major hash happened to be 0.
However, it probably had the largest chance of causing problems on
ubifs, since ubifs uses minor_hash to do lookups of no-key names, in
addition to using it as a readdir cookie. ext4 only uses minor_hash as
a readdir cookie, and f2fs doesn't use minor_hash at all.
Fixes: 0b81d0779072 ("fs crypto: move per-file encryption from f2fs tree to fs/crypto")
Cc: <stable(a)vger.kernel.org> # v4.6+
Link: https://lore.kernel.org/r/20210527235236.2376556-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
index 6ca7d16593ff..d00455440d08 100644
--- a/fs/crypto/fname.c
+++ b/fs/crypto/fname.c
@@ -344,13 +344,9 @@ int fscrypt_fname_disk_to_usr(const struct inode *inode,
offsetof(struct fscrypt_nokey_name, sha256));
BUILD_BUG_ON(BASE64_CHARS(FSCRYPT_NOKEY_NAME_MAX) > NAME_MAX);
- if (hash) {
- nokey_name.dirhash[0] = hash;
- nokey_name.dirhash[1] = minor_hash;
- } else {
- nokey_name.dirhash[0] = 0;
- nokey_name.dirhash[1] = 0;
- }
+ nokey_name.dirhash[0] = hash;
+ nokey_name.dirhash[1] = minor_hash;
+
if (iname->len <= sizeof(nokey_name.bytes)) {
memcpy(nokey_name.bytes, iname->name, iname->len);
size = offsetof(struct fscrypt_nokey_name, bytes[iname->len]);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 77f30bfcfcf484da7208affd6a9e63406420bf91 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Thu, 27 May 2021 16:52:36 -0700
Subject: [PATCH] fscrypt: don't ignore minor_hash when hash is 0
When initializing a no-key name, fscrypt_fname_disk_to_usr() sets the
minor_hash to 0 if the (major) hash is 0.
This doesn't make sense because 0 is a valid hash code, so we shouldn't
ignore the filesystem-provided minor_hash in that case. Fix this by
removing the special case for 'hash == 0'.
This is an old bug that appears to have originated when the encryption
code in ext4 and f2fs was moved into fs/crypto/. The original ext4 and
f2fs code passed the hash by pointer instead of by value. So
'if (hash)' actually made sense then, as it was checking whether a
pointer was NULL. But now the hashes are passed by value, and
filesystems just pass 0 for any hashes they don't have. There is no
need to handle this any differently from the hashes actually being 0.
It is difficult to reproduce this bug, as it only made a difference in
the case where a filename's 32-bit major hash happened to be 0.
However, it probably had the largest chance of causing problems on
ubifs, since ubifs uses minor_hash to do lookups of no-key names, in
addition to using it as a readdir cookie. ext4 only uses minor_hash as
a readdir cookie, and f2fs doesn't use minor_hash at all.
Fixes: 0b81d0779072 ("fs crypto: move per-file encryption from f2fs tree to fs/crypto")
Cc: <stable(a)vger.kernel.org> # v4.6+
Link: https://lore.kernel.org/r/20210527235236.2376556-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
index 6ca7d16593ff..d00455440d08 100644
--- a/fs/crypto/fname.c
+++ b/fs/crypto/fname.c
@@ -344,13 +344,9 @@ int fscrypt_fname_disk_to_usr(const struct inode *inode,
offsetof(struct fscrypt_nokey_name, sha256));
BUILD_BUG_ON(BASE64_CHARS(FSCRYPT_NOKEY_NAME_MAX) > NAME_MAX);
- if (hash) {
- nokey_name.dirhash[0] = hash;
- nokey_name.dirhash[1] = minor_hash;
- } else {
- nokey_name.dirhash[0] = 0;
- nokey_name.dirhash[1] = 0;
- }
+ nokey_name.dirhash[0] = hash;
+ nokey_name.dirhash[1] = minor_hash;
+
if (iname->len <= sizeof(nokey_name.bytes)) {
memcpy(nokey_name.bytes, iname->name, iname->len);
size = offsetof(struct fscrypt_nokey_name, bytes[iname->len]);