This is a note to let you know that I've just added the patch titled
IB/srpt: Disable RDMA access by the initiator
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-srpt-disable-rdma-access-by-the-initiator.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From bec40c26041de61162f7be9d2ce548c756ce0f65 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche(a)wdc.com>
Date: Wed, 3 Jan 2018 13:39:15 -0800
Subject: IB/srpt: Disable RDMA access by the initiator
From: Bart Van Assche <bart.vanassche(a)wdc.com>
commit bec40c26041de61162f7be9d2ce548c756ce0f65 upstream.
With the SRP protocol all RDMA operations are initiated by the target.
Since no RDMA operations are initiated by the initiator, do not grant
the initiator permission to submit RDMA reads or writes to the target.
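For context, here is a hedged sketch of how a target-side QP can be
moved to INIT with local-write-only access rights; the verbs calls are
real, but the helper name and the omitted error handling are
simplifications, not the srpt code itself:

#include <rdma/ib_verbs.h>

/* Sketch: since the target initiates every RDMA operation, the QP needs
 * no REMOTE_READ/REMOTE_WRITE access flags on behalf of the initiator.
 */
static int example_init_target_qp(struct ib_qp *qp, u8 port_num)
{
	struct ib_qp_attr attr = {
		.qp_state        = IB_QPS_INIT,
		.qp_access_flags = IB_ACCESS_LOCAL_WRITE,
		.port_num        = port_num,
		.pkey_index      = 0,
	};

	return ib_modify_qp(qp, &attr,
			    IB_QP_STATE | IB_QP_ACCESS_FLAGS |
			    IB_QP_PORT | IB_QP_PKEY_INDEX);
}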
Signed-off-by: Bart Van Assche <bart.vanassche(a)wdc.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/ulp/srpt/ib_srpt.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -959,8 +959,7 @@ static int srpt_init_ch_qp(struct srpt_r
return -ENOMEM;
attr->qp_state = IB_QPS_INIT;
- attr->qp_access_flags = IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_READ |
- IB_ACCESS_REMOTE_WRITE;
+ attr->qp_access_flags = IB_ACCESS_LOCAL_WRITE;
attr->port_num = ch->sport->port;
attr->pkey_index = 0;
Patches currently in stable-queue which might be from bart.vanassche(a)wdc.com are
queue-3.18/ib-srpt-disable-rdma-access-by-the-initiator.patch
This is a note to let you know that I've just added the patch titled
can: gs_usb: fix return value of the "set_bittiming" callback
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
can-gs_usb-fix-return-value-of-the-set_bittiming-callback.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From d5b42e6607661b198d8b26a0c30969605b1bf5c7 Mon Sep 17 00:00:00 2001
From: Wolfgang Grandegger <wg(a)grandegger.com>
Date: Wed, 13 Dec 2017 19:52:23 +0100
Subject: can: gs_usb: fix return value of the "set_bittiming" callback
From: Wolfgang Grandegger <wg(a)grandegger.com>
commit d5b42e6607661b198d8b26a0c30969605b1bf5c7 upstream.
The "set_bittiming" callback treats a positive return value as error!
For that reason "can_changelink()" will quit silently after setting
the bittiming values without processing ctrlmode, restart-ms, etc.
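As a minimal sketch of the idiom the fix uses (hypothetical helper,
assuming the return code comes from usb_control_msg(), which returns
the number of transferred bytes on success or a negative errno on
failure):

#include <linux/netdevice.h>

static int example_normalize_bittiming_rc(struct net_device *netdev, int rc)
{
	if (rc < 0) {
		netdev_err(netdev, "couldn't set bittimings (err=%d)\n", rc);
		return rc;
	}

	/*
	 * Fold the positive byte count to 0: the CAN core treats any
	 * non-zero return from the set_bittiming callback as an error
	 * and would then skip ctrlmode, restart-ms, etc.
	 */
	return 0;
}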
Signed-off-by: Wolfgang Grandegger <wg(a)grandegger.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/can/usb/gs_usb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -430,7 +430,7 @@ static int gs_usb_set_bittiming(struct n
dev_err(netdev->dev.parent, "Couldn't set bittimings (err=%d)",
rc);
- return rc;
+ return (rc > 0) ? 0 : rc;
}
static void gs_usb_xmit_callback(struct urb *urb)
Patches currently in stable-queue which might be from wg(a)grandegger.com are
queue-3.18/can-gs_usb-fix-return-value-of-the-set_bittiming-callback.patch
commit 98b8e4e5c17bf87c1b18ed929472051dab39878c upstream
Please apply to 4.9.x and later stable kernel trees (this is as far back
as 86d9f48534e8 was applied).
This submission should have followed Option 1, but I failed to Cc:
stable in the original commit.
Calling acpi_wmi_init() at the subsys_initcall() level causes ordering
issues to appear on some systems, and they are difficult to reproduce
because there is no guaranteed ordering between subsys_initcall()
calls, so they may occur in different orders on different systems.
In particular, commit 86d9f48534e8 ("mm/slab: fix kmemcg cache
creation delayed issue") exposed one of these issues: genl_init()
and acpi_wmi_init() are both called at the same initcall level, but
the former must run before the latter so as to avoid a NULL pointer
dereference.
For this reason, move the acpi_wmi_init() invocation to the
subsys_initcall_sync() level, which should still be early enough for
things to work correctly in WMI land.
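As a rough illustration of the ordering being relied on (a hedged
sketch, not the wmi.c code): every plain subsys_initcall() runs before
any subsys_initcall_sync() entry, so an init function moved to the
_sync level is guaranteed to run after genl_init(), which stays at
subsys_initcall().

#include <linux/init.h>

static int __init example_wmi_like_init(void)
{
	/*
	 * Runs only after every plain subsys_initcall() has completed,
	 * regardless of link order within the subsys level.
	 */
	return 0;
}
subsys_initcall_sync(example_wmi_like_init);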
Link: https://marc.info/?t=151274596700002&r=1&w=2
Reported-by: Jonathan McDowell <noodles(a)earth.li>
Reported-by: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Tested-by: Jonathan McDowell <noodles(a)earth.li>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Darren Hart (VMware) <dvhart(a)infradead.org>
---
drivers/platform/x86/wmi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c
index 791449a..daa68ac 100644
--- a/drivers/platform/x86/wmi.c
+++ b/drivers/platform/x86/wmi.c
@@ -1458,5 +1458,5 @@ static void __exit acpi_wmi_exit(void)
class_unregister(&wmi_bus_class);
}
-subsys_initcall(acpi_wmi_init);
+subsys_initcall_sync(acpi_wmi_init);
module_exit(acpi_wmi_exit);
--
2.9.4
--
Darren Hart
VMware Open Source Technology Center
From: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
commit 68920c973254c5b71a684645c5f6f82d6732c5d6 upstream.
With upcoming CONFIG_UBSAN the following BUILD_BUG_ON in
net/mac80211/debugfs.c starts to trigger:
BUILD_BUG_ON(hw_flag_names[NUM_IEEE80211_HW_FLAGS] != (void *)0x1);
It seems that compiler instrumentation causes some code
deoptimizations, and because of that GCC is not able to resolve the
condition in BUILD_BUG_ON() at compile time.
We could make the size of the hw_flag_names array unspecified and
replace the condition in BUILD_BUG_ON() with the following:
ARRAY_SIZE(hw_flag_names) != NUM_IEEE80211_HW_FLAGS
That will have the same effect as before (adding a new flag without
updating the array will still trigger a build failure) except that it
doesn't fail with CONFIG_UBSAN. As a bonus, this patch slightly
decreases the size of the hw_flag_names array.
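For reference, a minimal sketch of the pattern with a hypothetical
enum/name-table pair (not the mac80211 code): the array size is left
unspecified and BUILD_BUG_ON() compares ARRAY_SIZE() against the enum
terminator, so adding a flag without adding a name still breaks the
build, without needing a sentinel entry that UBSAN-instrumented builds
cannot fold at compile time.

#include <linux/bug.h>
#include <linux/kernel.h>

enum example_hw_flags {
	EXAMPLE_HW_FLAG_A,
	EXAMPLE_HW_FLAG_B,
	NUM_EXAMPLE_HW_FLAGS
};

static const char *example_flag_names[] = {
	[EXAMPLE_HW_FLAG_A] = "A",
	[EXAMPLE_HW_FLAG_B] = "B",
};

static void example_check_names(void)
{
	/* Fails the build if the array and the enum get out of sync. */
	BUILD_BUG_ON(ARRAY_SIZE(example_flag_names) != NUM_EXAMPLE_HW_FLAGS);
}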
Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Johannes Berg <johannes(a)sipsolutions.net>
Cc: "David S. Miller" <davem(a)davemloft.net>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
[Daniel: backport to 4.4.]
Signed-off-by: Daniel Wagner <daniel.wagner(a)siemens.com>
---
Hi,
The only stable tree which is missing this fix is 4.4. 4.1 doesn't
have 30686bf7f5b3 ("mac80211: convert HW flags to unsigned long
bitmap") which makes gcc unhappy with allmodconfig. 4.9 contains the
fix.
Thanks,
Daniel
net/mac80211/debugfs.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/net/mac80211/debugfs.c b/net/mac80211/debugfs.c
index 4d2aaebd4f97..e546a987a9d3 100644
--- a/net/mac80211/debugfs.c
+++ b/net/mac80211/debugfs.c
@@ -91,7 +91,7 @@ static const struct file_operations reset_ops = {
};
#endif
-static const char *hw_flag_names[NUM_IEEE80211_HW_FLAGS + 1] = {
+static const char *hw_flag_names[] = {
#define FLAG(F) [IEEE80211_HW_##F] = #F
FLAG(HAS_RATE_CONTROL),
FLAG(RX_INCLUDES_FCS),
@@ -125,9 +125,6 @@ static const char *hw_flag_names[NUM_IEEE80211_HW_FLAGS + 1] = {
FLAG(TDLS_WIDER_BW),
FLAG(SUPPORTS_AMSDU_IN_AMPDU),
FLAG(BEACON_TX_STATUS),
-
- /* keep last for the build bug below */
- (void *)0x1
#undef FLAG
};
@@ -147,7 +144,7 @@ static ssize_t hwflags_read(struct file *file, char __user *user_buf,
/* fail compilation if somebody adds or removes
* a flag without updating the name array above
*/
- BUILD_BUG_ON(hw_flag_names[NUM_IEEE80211_HW_FLAGS] != (void *)0x1);
+ BUILD_BUG_ON(ARRAY_SIZE(hw_flag_names) != NUM_IEEE80211_HW_FLAGS);
for (i = 0; i < NUM_IEEE80211_HW_FLAGS; i++) {
if (test_bit(i, local->hw.flags))
--
2.14.3
From: Peter Zijlstra <peterz(a)infradead.org>
commit 321027c1fe77f892f4ea07846aeae08cefbbb290 upstream.
Di Shen reported a race between two concurrent sys_perf_event_open()
calls where both try and move the same pre-existing software group
into a hardware context.
The problem is exactly that described in commit:
f63a8daa5812 ("perf: Fix event->ctx locking")
... where, while we wait for a ctx->mutex acquisition, the event->ctx
relation can have changed under us.
That very same commit failed to recognise sys_perf_event_open() as an
external access vector to the events and thereby didn't apply the
established locking rules correctly.
So while one sys_perf_event_open() call is stuck waiting on
mutex_lock_double(), the other (which owns said locks) moves the group
about. So by the time the former sys_perf_event_open() acquires the
locks, the context we've acquired is stale (and possibly dead).
Apply the established locking rules as per perf_event_ctx_lock_nested()
to the mutex_lock_double() for the 'move_group' case. This obviously means
we need to validate state after we acquire the locks.
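A distilled sketch of that lock-then-revalidate idiom follows, using
hypothetical foo/foo_ctx types and a foo_ctx_put() helper rather than
the perf code itself (the real implementation is
__perf_event_ctx_lock_double() in the hunk below):

#include <linux/atomic.h>
#include <linux/mutex.h>
#include <linux/rcupdate.h>

struct foo_ctx {
	atomic_t	refcount;
	struct mutex	mutex;
};

struct foo {
	struct foo_ctx __rcu *ctx;
};

static void foo_ctx_put(struct foo_ctx *ctx)
{
	atomic_dec(&ctx->refcount);	/* real code would free at zero */
}

static struct foo_ctx *foo_ctx_lock(struct foo *f)
{
	struct foo_ctx *ctx;

again:
	rcu_read_lock();
	ctx = rcu_dereference(f->ctx);
	if (!atomic_inc_not_zero(&ctx->refcount)) {
		rcu_read_unlock();
		goto again;
	}
	rcu_read_unlock();

	mutex_lock(&ctx->mutex);
	if (rcu_access_pointer(f->ctx) != ctx) {
		/* the context moved under us while we slept on the mutex */
		mutex_unlock(&ctx->mutex);
		foo_ctx_put(ctx);
		goto again;
	}
	return ctx;
}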
Reported-by: Di Shen (Keen Lab)
Tested-by: John Dias <joaodias(a)google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme(a)kernel.org>
Cc: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Min Chong <mchong(a)google.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Stephane Eranian <eranian(a)google.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Vince Weaver <vincent.weaver(a)maine.edu>
Fixes: f63a8daa5812 ("perf: Fix event->ctx locking")
Link: http://lkml.kernel.org/r/20170106131444.GZ3174@twins.programming.kicks-ass.…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
[bwh: Backported to 3.16:
- Use ACCESS_ONCE() instead of READ_ONCE()
- Test perf_event::group_flags instead of group_caps
- Add the err_locked cleanup block, which we didn't need before
- Adjust context]
Signed-off-by: Ben Hutchings <ben(a)decadent.org.uk>
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Signed-off-by: Amit Pundir <amit.pundir(a)linaro.org>
---
This upstream patch is featured in a recent Android Security Bulletin.
I picked up this backported patch from android-3.18. Build tested on 3.18.91.
kernel/events/core.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 57 insertions(+), 4 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9b12efcefdf7..de3303aab7d6 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7414,6 +7414,37 @@ static void mutex_lock_double(struct mutex *a, struct mutex *b)
mutex_lock_nested(b, SINGLE_DEPTH_NESTING);
}
+/*
+ * Variation on perf_event_ctx_lock_nested(), except we take two context
+ * mutexes.
+ */
+static struct perf_event_context *
+__perf_event_ctx_lock_double(struct perf_event *group_leader,
+ struct perf_event_context *ctx)
+{
+ struct perf_event_context *gctx;
+
+again:
+ rcu_read_lock();
+ gctx = ACCESS_ONCE(group_leader->ctx);
+ if (!atomic_inc_not_zero(&gctx->refcount)) {
+ rcu_read_unlock();
+ goto again;
+ }
+ rcu_read_unlock();
+
+ mutex_lock_double(&gctx->mutex, &ctx->mutex);
+
+ if (group_leader->ctx != gctx) {
+ mutex_unlock(&ctx->mutex);
+ mutex_unlock(&gctx->mutex);
+ put_ctx(gctx);
+ goto again;
+ }
+
+ return gctx;
+}
+
/**
* sys_perf_event_open - open a performance event, associate it to a task/cpu
*
@@ -7626,14 +7657,31 @@ SYSCALL_DEFINE5(perf_event_open,
}
if (move_group) {
- gctx = group_leader->ctx;
+ gctx = __perf_event_ctx_lock_double(group_leader, ctx);
+
+ /*
+ * Check if we raced against another sys_perf_event_open() call
+ * moving the software group underneath us.
+ */
+ if (!(group_leader->group_flags & PERF_GROUP_SOFTWARE)) {
+ /*
+ * If someone moved the group out from under us, check
+ * if this new event wound up on the same ctx, if so
+ * its the regular !move_group case, otherwise fail.
+ */
+ if (gctx != ctx) {
+ err = -EINVAL;
+ goto err_locked;
+ } else {
+ perf_event_ctx_unlock(group_leader, gctx);
+ move_group = 0;
+ }
+ }
/*
* See perf_event_ctx_lock() for comments on the details
* of swizzling perf_event::ctx.
*/
- mutex_lock_double(&gctx->mutex, &ctx->mutex);
-
perf_remove_from_context(group_leader, false);
/*
@@ -7674,7 +7722,7 @@ SYSCALL_DEFINE5(perf_event_open,
perf_unpin_context(ctx);
if (move_group) {
- mutex_unlock(&gctx->mutex);
+ perf_event_ctx_unlock(group_leader, gctx);
put_ctx(gctx);
}
mutex_unlock(&ctx->mutex);
@@ -7703,6 +7751,11 @@ SYSCALL_DEFINE5(perf_event_open,
fd_install(event_fd, event_file);
return event_fd;
+err_locked:
+ if (move_group)
+ perf_event_ctx_unlock(group_leader, gctx);
+ mutex_unlock(&ctx->mutex);
+ fput(event_file);
err_context:
perf_unpin_context(ctx);
put_ctx(ctx);
--
2.7.4
The patch below does not apply to the 3.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c8c5a3a24d395b14447a9a89d61586a913840a3b Mon Sep 17 00:00:00 2001
From: "Maciej W. Rozycki" <macro(a)mips.com>
Date: Mon, 11 Dec 2017 22:56:54 +0000
Subject: [PATCH] MIPS: Disallow outsized PTRACE_SETREGSET NT_PRFPREG regset
accesses
Complement commit c23b3d1a5311 ("MIPS: ptrace: Change GP regset to use
correct core dump register layout") and also reject outsized
PTRACE_SETREGSET requests to the NT_PRFPREG regset, like with the
NT_PRSTATUS regset.
Signed-off-by: Maciej W. Rozycki <macro(a)mips.com>
Fixes: c23b3d1a5311 ("MIPS: ptrace: Change GP regset to use correct core dump register layout")
Cc: James Hogan <james.hogan(a)mips.com>
Cc: Paul Burton <Paul.Burton(a)mips.com>
Cc: Alex Smith <alex(a)alex-smith.me.uk>
Cc: Dave Martin <Dave.Martin(a)arm.com>
Cc: linux-mips(a)linux-mips.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v3.17+
Patchwork: https://patchwork.linux-mips.org/patch/17930/
Signed-off-by: Ralf Baechle <ralf(a)linux-mips.org>
diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
index 256908951a7c..0b23b1ad99e6 100644
--- a/arch/mips/kernel/ptrace.c
+++ b/arch/mips/kernel/ptrace.c
@@ -550,6 +550,9 @@ static int fpr_set(struct task_struct *target,
BUG_ON(count % sizeof(elf_fpreg_t));
+ if (pos + count > sizeof(elf_fpregset_t))
+ return -EIO;
+
init_fp_ctx(target);
if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t))
The patch below does not apply to the 3.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From be07a6a1188372b6d19a3307ec33211fc9c9439d Mon Sep 17 00:00:00 2001
From: "Maciej W. Rozycki" <macro(a)mips.com>
Date: Mon, 11 Dec 2017 22:54:33 +0000
Subject: [PATCH] MIPS: Fix an FCSR access API regression with NT_PRFPREG and
MSA
Fix a public API regression introduced by commit 72b22bbad1e7 ("MIPS:
Don't assume 64-bit FP registers for FP regset") and activated by
commit 1db1af84d6df ("MIPS: Basic MSA context switching support") that
caused the FCSR register not to be read or written for
CONFIG_CPU_HAS_MSA kernel configurations (regardless of the actual
presence or absence of the MSA feature in a given processor) with
ptrace(2) PTRACE_GETREGSET and PTRACE_SETREGSET requests, nor recorded
in core dumps.
This is because with !CONFIG_CPU_HAS_MSA configurations the whole of
`elf_fpregset_t' array is bulk-copied as it is, which includes the FCSR
in one half of the last, 33rd slot, whereas with CONFIG_CPU_HAS_MSA
configurations array elements are copied individually, and then only the
leading 32 FGR slots while the remaining slot is ignored.
Correct the code then such that only the FGR slots are copied in the
respective !MSA and MSA helpers, and then the FCSR slot is handled
separately in common code. Use `ptrace_setfcr31' to update the FCSR
too, so that the read-only mask is respected.
Retrieving a correct value of FCSR is important in debugging not only
for the human to be able to get the right interpretation of the
situation, but for correct operation of GDB as well. This is because
the condition code bits in FCSR are used by GDB to determine the
location to place a breakpoint at when single-stepping through an FPU
branch instruction. If such a breakpoint is placed incorrectly (i.e.
with the condition reversed), then it will be missed, likely causing the
debuggee to run away from the control of GDB and consequently breaking
the process of investigation.
Fortunately GDB continues using the older PTRACE_GETFPREGS ptrace(2)
request which is unaffected, so the regression only really hits with
post-mortem debug sessions using a core dump file, in which case
execution, and consequently single-stepping through branches is not
possible. Of course core files created by buggy kernels out there will
have the value of FCSR recorded clobbered, but such core files cannot be
corrected and the person using them simply will have to be aware that
the value of FCSR retrieved is not reliable.
This also means we can likely get away without defining a replacement
API which would ensure that a correct value of FCSR is retrieved, or
none at all.
This is based on previous work by Alex Smith, extensively rewritten.
Signed-off-by: Alex Smith <alex(a)alex-smith.me.uk>
Signed-off-by: James Hogan <james.hogan(a)mips.com>
Signed-off-by: Maciej W. Rozycki <macro(a)mips.com>
Fixes: 72b22bbad1e7 ("MIPS: Don't assume 64-bit FP registers for FP regset")
Cc: Paul Burton <Paul.Burton(a)mips.com>
Cc: Dave Martin <Dave.Martin(a)arm.com>
Cc: linux-mips(a)linux-mips.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v3.15+
Patchwork: https://patchwork.linux-mips.org/patch/17928/
Signed-off-by: Ralf Baechle <ralf(a)linux-mips.org>
diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
index 47a01d5f26ea..0a939593ccb7 100644
--- a/arch/mips/kernel/ptrace.c
+++ b/arch/mips/kernel/ptrace.c
@@ -422,7 +422,7 @@ static int gpr64_set(struct task_struct *target,
/*
* Copy the floating-point context to the supplied NT_PRFPREG buffer,
* !CONFIG_CPU_HAS_MSA variant. FP context's general register slots
- * correspond 1:1 to buffer slots.
+ * correspond 1:1 to buffer slots. Only general registers are copied.
*/
static int fpr_get_fpa(struct task_struct *target,
unsigned int *pos, unsigned int *count,
@@ -430,13 +430,14 @@ static int fpr_get_fpa(struct task_struct *target,
{
return user_regset_copyout(pos, count, kbuf, ubuf,
&target->thread.fpu,
- 0, sizeof(elf_fpregset_t));
+ 0, NUM_FPU_REGS * sizeof(elf_fpreg_t));
}
/*
* Copy the floating-point context to the supplied NT_PRFPREG buffer,
* CONFIG_CPU_HAS_MSA variant. Only lower 64 bits of FP context's
- * general register slots are copied to buffer slots.
+ * general register slots are copied to buffer slots. Only general
+ * registers are copied.
*/
static int fpr_get_msa(struct task_struct *target,
unsigned int *pos, unsigned int *count,
@@ -458,20 +459,29 @@ static int fpr_get_msa(struct task_struct *target,
return 0;
}
-/* Copy the floating-point context to the supplied NT_PRFPREG buffer. */
+/*
+ * Copy the floating-point context to the supplied NT_PRFPREG buffer.
+ * Choose the appropriate helper for general registers, and then copy
+ * the FCSR register separately.
+ */
static int fpr_get(struct task_struct *target,
const struct user_regset *regset,
unsigned int pos, unsigned int count,
void *kbuf, void __user *ubuf)
{
+ const int fcr31_pos = NUM_FPU_REGS * sizeof(elf_fpreg_t);
int err;
- /* XXX fcr31 */
-
if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t))
err = fpr_get_fpa(target, &pos, &count, &kbuf, &ubuf);
else
err = fpr_get_msa(target, &pos, &count, &kbuf, &ubuf);
+ if (err)
+ return err;
+
+ err = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ &target->thread.fpu.fcr31,
+ fcr31_pos, fcr31_pos + sizeof(u32));
return err;
}
@@ -479,7 +489,7 @@ static int fpr_get(struct task_struct *target,
/*
* Copy the supplied NT_PRFPREG buffer to the floating-point context,
* !CONFIG_CPU_HAS_MSA variant. Buffer slots correspond 1:1 to FP
- * context's general register slots.
+ * context's general register slots. Only general registers are copied.
*/
static int fpr_set_fpa(struct task_struct *target,
unsigned int *pos, unsigned int *count,
@@ -487,13 +497,14 @@ static int fpr_set_fpa(struct task_struct *target,
{
return user_regset_copyin(pos, count, kbuf, ubuf,
&target->thread.fpu,
- 0, sizeof(elf_fpregset_t));
+ 0, NUM_FPU_REGS * sizeof(elf_fpreg_t));
}
/*
* Copy the supplied NT_PRFPREG buffer to the floating-point context,
* CONFIG_CPU_HAS_MSA variant. Buffer slots are copied to lower 64
- * bits only of FP context's general register slots.
+ * bits only of FP context's general register slots. Only general
+ * registers are copied.
*/
static int fpr_set_msa(struct task_struct *target,
unsigned int *pos, unsigned int *count,
@@ -518,6 +529,8 @@ static int fpr_set_msa(struct task_struct *target,
/*
* Copy the supplied NT_PRFPREG buffer to the floating-point context.
+ * Choose the appropriate helper for general registers, and then copy
+ * the FCSR register separately.
*
* We optimize for the case where `count % sizeof(elf_fpreg_t) == 0',
* which is supposed to have been guaranteed by the kernel before
@@ -530,18 +543,30 @@ static int fpr_set(struct task_struct *target,
unsigned int pos, unsigned int count,
const void *kbuf, const void __user *ubuf)
{
+ const int fcr31_pos = NUM_FPU_REGS * sizeof(elf_fpreg_t);
+ u32 fcr31;
int err;
BUG_ON(count % sizeof(elf_fpreg_t));
- /* XXX fcr31 */
-
init_fp_ctx(target);
if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t))
err = fpr_set_fpa(target, &pos, &count, &kbuf, &ubuf);
else
err = fpr_set_msa(target, &pos, &count, &kbuf, &ubuf);
+ if (err)
+ return err;
+
+ if (count > 0) {
+ err = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+ &fcr31,
+ fcr31_pos, fcr31_pos + sizeof(u32));
+ if (err)
+ return err;
+
+ ptrace_setfcr31(target, fcr31);
+ }
return err;
}
The patch below does not apply to the 3.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 80b3ffce0196ea50068885d085ff981e4b8396f4 Mon Sep 17 00:00:00 2001
From: "Maciej W. Rozycki" <macro(a)mips.com>
Date: Mon, 11 Dec 2017 22:53:14 +0000
Subject: [PATCH] MIPS: Consistently handle buffer counter with
PTRACE_SETREGSET
Fix a bug from commit d614fd58a283 ("mips/ptrace: Preserve previous
registers for short regset write") and consistently consume all data
supplied to `fpr_set_msa' with the ptrace(2) PTRACE_SETREGSET request,
such that a zero data buffer counter is returned where insufficient
data has been given to fill a whole number of FP general registers.
In reality this is not going to happen, as the caller is supposed to
only supply data covering a whole number of registers; this is
verified in `ptrace_regset' and again asserted in `fpr_set'. However,
structuring the code such that the presence of trailing partial FP
general register data causes `fpr_set_msa' to return with a non-zero
data buffer counter makes it appear that this trailing data will be
used if there are subsequent writes made to FP registers, which is
going to be the case with the FCSR once the missing write to that
register has been fixed.
Fixes: d614fd58a283 ("mips/ptrace: Preserve previous registers for short regset write")
Signed-off-by: Maciej W. Rozycki <macro(a)mips.com>
Cc: James Hogan <james.hogan(a)mips.com>
Cc: Paul Burton <Paul.Burton(a)mips.com>
Cc: Alex Smith <alex(a)alex-smith.me.uk>
Cc: Dave Martin <Dave.Martin(a)arm.com>
Cc: linux-mips(a)linux-mips.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v4.11+
Patchwork: https://patchwork.linux-mips.org/patch/17927/
Signed-off-by: Ralf Baechle <ralf(a)linux-mips.org>
diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
index 7fcadaaf330f..47a01d5f26ea 100644
--- a/arch/mips/kernel/ptrace.c
+++ b/arch/mips/kernel/ptrace.c
@@ -504,7 +504,7 @@ static int fpr_set_msa(struct task_struct *target,
int err;
BUILD_BUG_ON(sizeof(fpr_val) != sizeof(elf_fpreg_t));
- for (i = 0; i < NUM_FPU_REGS && *count >= sizeof(elf_fpreg_t); i++) {
+ for (i = 0; i < NUM_FPU_REGS && *count > 0; i++) {
err = user_regset_copyin(pos, count, kbuf, ubuf,
&fpr_val, i * sizeof(elf_fpreg_t),
(i + 1) * sizeof(elf_fpreg_t));
The patch below does not apply to the 3.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dc24d0edf33c3e15099688b6bbdf7bdc24bf6e91 Mon Sep 17 00:00:00 2001
From: "Maciej W. Rozycki" <macro(a)mips.com>
Date: Mon, 11 Dec 2017 22:52:15 +0000
Subject: [PATCH] MIPS: Guard against any partial write attempt with
PTRACE_SETREGSET
Complement commit d614fd58a283 ("mips/ptrace: Preserve previous
registers for short regset write") and ensure that no partial register
write attempt is made with PTRACE_SETREGSET, as we do not preinitialize
any temporaries used to hold incoming register data and consequently
random data could be written.
It is the responsibility of the caller, such as `ptrace_regset', to
arrange for writes to span whole registers only, so here we only assert
that it has indeed happened.
Signed-off-by: Maciej W. Rozycki <macro(a)mips.com>
Fixes: 72b22bbad1e7 ("MIPS: Don't assume 64-bit FP registers for FP regset")
Cc: James Hogan <james.hogan(a)mips.com>
Cc: Paul Burton <Paul.Burton(a)mips.com>
Cc: Alex Smith <alex(a)alex-smith.me.uk>
Cc: Dave Martin <Dave.Martin(a)arm.com>
Cc: linux-mips(a)linux-mips.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v3.15+
Patchwork: https://patchwork.linux-mips.org/patch/17926/
Signed-off-by: Ralf Baechle <ralf(a)linux-mips.org>
diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
index 62e8ffd9370a..7fcadaaf330f 100644
--- a/arch/mips/kernel/ptrace.c
+++ b/arch/mips/kernel/ptrace.c
@@ -516,7 +516,15 @@ static int fpr_set_msa(struct task_struct *target,
return 0;
}
-/* Copy the supplied NT_PRFPREG buffer to the floating-point context. */
+/*
+ * Copy the supplied NT_PRFPREG buffer to the floating-point context.
+ *
+ * We optimize for the case where `count % sizeof(elf_fpreg_t) == 0',
+ * which is supposed to have been guaranteed by the kernel before
+ * calling us, e.g. in `ptrace_regset'. We enforce that requirement,
+ * so that we can safely avoid preinitializing temporaries for
+ * partial register writes.
+ */
static int fpr_set(struct task_struct *target,
const struct user_regset *regset,
unsigned int pos, unsigned int count,
@@ -524,6 +532,8 @@ static int fpr_set(struct task_struct *target,
{
int err;
+ BUG_ON(count % sizeof(elf_fpreg_t));
+
/* XXX fcr31 */
init_fp_ctx(target);
On Tue, Jan 09, 2018 at 08:49:16PM +0000, Mathieu Desnoyers wrote:
> Hi Greg,
>
> commit e39d200fa "KVM: Fix stack-out-of-bounds read in write_mmio"
> is upstream in latest 4.15-rc tags, is also in long term
> 3.16.52+ and 3.2.97+ stable kernels, and in latest Fedora kernels,
> but is missing from 4.14, 4.9, 4.4, 4.1 stable branches.
>
> I think the patch author missed the CC to stable(a)vger.kernel.org
>
> It might be good to consider picking it up into the stable kernels
> to ensure long term stable and distro kernels don't diverge too much
> from more recent stable kernel branches.
Thanks, now picked up.
I wonder how it got into the 3.16 and 3.2 kernels :(
Also, always cc: stable(a)vger.kernel.org so that all of the stable
maintainers can see it, as I am not in charge of all of them (i.e.
4.1)
thanks,
greg k-h
The bounce buffer is gone from the MMC core, and now we have found out
that there are some (crippled) i.MX boards out there that have broken
ADMA (cannot do scatter-gather) and broken PIO, so they must use SDMA.
Closer examination shows a less significant slowdown also on SDMA-only
capable laptop hosts.
SDMA limits the number of segments to one, so each segment gets turned
into a singular request that ping-pongs to the block layer before the
next request/segment is issued.
Apparently the block layer often sends requests that include many
physically discontiguous segments. My guess is that this phenomenon
comes from the file system.
Devices that cannot handle scatterlists in hardware can see major
benefits from a DMA-contiguous bounce buffer.
This patch accumulates those fragmented scatterlists in a physically
contiguous bounce buffer so that we can issue bigger DMA data chunks
to/from the card.
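A minimal sketch of the bounce-buffer idiom (hypothetical helpers, not
the sdhci code itself): allocate one DMA-coherent buffer up front, pack
the scatterlist into it before a write and unpack it after a read, so
the controller only ever sees a single contiguous DMA address.

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static void *bounce_buf;
static dma_addr_t bounce_dma;

static int example_bounce_setup(struct device *dev, size_t size)
{
	bounce_buf = dma_alloc_coherent(dev, size, &bounce_dma, GFP_KERNEL);
	return bounce_buf ? 0 : -ENOMEM;
}

static void example_bounce_for_write(struct scatterlist *sg,
				     unsigned int nents, size_t size)
{
	/* Pack the fragmented sglist into one contiguous chunk; the host
	 * controller is then programmed with bounce_dma as the single
	 * DMA address. */
	sg_copy_to_buffer(sg, nents, bounce_buf, size);
}

static void example_bounce_for_read(struct scatterlist *sg,
				    unsigned int nents, size_t size)
{
	/* After the transfer completes, scatter the data back out. */
	sg_copy_from_buffer(sg, nents, bounce_buf, size);
}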
When tested with this PCI-integrated host (1217:8221) that only
supports SDMA:
0b:00.0 SD Host controller: O2 Micro, Inc. OZ600FJ0/OZ900FJ0/OZ600FJS
SD/MMC Card Reader Controller (rev 05)
this patch gave ~1 Mbyte/s improved throughput on large reads and
writes when testing with iozone, compared to a kernel without the
patch.
On the i.MX SDHCI controllers on the crippled i.MX 25 and i.MX 35
the patch restores the performance to what it was before we removed
the bounce buffers, and then some: performance is better than ever
because we now allocate a bounce buffer the size of the maximum
single request the SDMA engine can handle. On the PCI laptop this
is 256K, whereas with the old bounce buffer code it was 64K max.
Cc: Pierre Ossman <pierre(a)ossman.eu>
Cc: Benoît Thébaudeau <benoit(a)wsystem.com>
Cc: Fabio Estevam <fabio.estevam(a)nxp.com>
Cc: stable(a)vger.kernel.org
Fixes: de3ee99b097d ("mmc: Delete bounce buffer handling")
Tested-by: Benjamin Beckmeyer <beckmeyer.b(a)rittal.de>
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
---
ChangeLog v2->v3:
- Rewrite the commit message a bit
- Add Benjamin's Tested-by
- Add Fixes and stable tags
ChangeLog v1->v2:
- Skip the remapping and fiddling with the buffer, instead use
dma_alloc_coherent() and use a simple, coherent bounce buffer.
- Couple kernel messages to ->parent of the mmc_host as it relates
to the hardware characteristics.
---
drivers/mmc/host/sdhci.c | 94 +++++++++++++++++++++++++++++++++++++++++++-----
drivers/mmc/host/sdhci.h | 3 ++
2 files changed, 89 insertions(+), 8 deletions(-)
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index e9290a3439d5..97d4c6fc1159 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -502,8 +502,20 @@ static int sdhci_pre_dma_transfer(struct sdhci_host *host,
if (data->host_cookie == COOKIE_PRE_MAPPED)
return data->sg_count;
- sg_count = dma_map_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
- mmc_get_dma_dir(data));
+ /* Bounce write requests to the bounce buffer */
+ if (host->bounce_buffer) {
+ if (mmc_get_dma_dir(data) == DMA_TO_DEVICE) {
+ /* Copy the data to the bounce buffer */
+ sg_copy_to_buffer(data->sg, data->sg_len,
+ host->bounce_buffer, host->bounce_buffer_size);
+ }
+ /* Just a dummy value */
+ sg_count = 1;
+ } else {
+ /* Just access the data directly from memory */
+ sg_count = dma_map_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
+ mmc_get_dma_dir(data));
+ }
if (sg_count == 0)
return -ENOSPC;
@@ -858,8 +870,13 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_command *cmd)
SDHCI_ADMA_ADDRESS_HI);
} else {
WARN_ON(sg_cnt != 1);
- sdhci_writel(host, sg_dma_address(data->sg),
- SDHCI_DMA_ADDRESS);
+ /* Bounce buffer goes to work */
+ if (host->bounce_buffer)
+ sdhci_writel(host, host->bounce_addr,
+ SDHCI_DMA_ADDRESS);
+ else
+ sdhci_writel(host, sg_dma_address(data->sg),
+ SDHCI_DMA_ADDRESS);
}
}
@@ -2248,7 +2265,12 @@ static void sdhci_pre_req(struct mmc_host *mmc, struct mmc_request *mrq)
mrq->data->host_cookie = COOKIE_UNMAPPED;
- if (host->flags & SDHCI_REQ_USE_DMA)
+ /*
+ * No pre-mapping in the pre hook if we're using the bounce buffer,
+ * for that we would need two bounce buffers since one buffer is
+ * in flight when this is getting called.
+ */
+ if (host->flags & SDHCI_REQ_USE_DMA && !host->bounce_buffer)
sdhci_pre_dma_transfer(host, mrq->data, COOKIE_PRE_MAPPED);
}
@@ -2352,8 +2374,19 @@ static bool sdhci_request_done(struct sdhci_host *host)
struct mmc_data *data = mrq->data;
if (data && data->host_cookie == COOKIE_MAPPED) {
- dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
- mmc_get_dma_dir(data));
+ if (host->bounce_buffer) {
+ /* On reads, copy the bounced data into the sglist */
+ if (mmc_get_dma_dir(data) == DMA_FROM_DEVICE) {
+ sg_copy_from_buffer(data->sg, data->sg_len,
+ host->bounce_buffer,
+ host->bounce_buffer_size);
+ }
+ } else {
+ /* Unmap the raw data */
+ dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+ data->sg_len,
+ mmc_get_dma_dir(data));
+ }
data->host_cookie = COOKIE_UNMAPPED;
}
}
@@ -2636,7 +2669,12 @@ static void sdhci_data_irq(struct sdhci_host *host, u32 intmask)
*/
if (intmask & SDHCI_INT_DMA_END) {
u32 dmastart, dmanow;
- dmastart = sg_dma_address(host->data->sg);
+
+ if (host->bounce_buffer)
+ dmastart = host->bounce_addr;
+ else
+ dmastart = sg_dma_address(host->data->sg);
+
dmanow = dmastart + host->data->bytes_xfered;
/*
* Force update to the next DMA block boundary.
@@ -3713,6 +3751,43 @@ int sdhci_setup_host(struct sdhci_host *host)
*/
mmc->max_blk_count = (host->quirks & SDHCI_QUIRK_NO_MULTIBLOCK) ? 1 : 65535;
+ if (mmc->max_segs == 1) {
+ unsigned int max_blocks;
+ unsigned int max_seg_size;
+
+ max_seg_size = mmc->max_req_size;
+ max_blocks = max_seg_size / 512;
+ dev_info(mmc->parent, "host only supports SDMA, activate bounce buffer\n");
+
+ /*
+ * When we just support one segment, we can get significant speedups
+ * by the help of a bounce buffer to group scattered reads/writes
+ * together.
+ *
+ * TODO: is this too big? Stealing too much memory? The old bounce
+ * buffer is max 64K. This should be the 512K that SDMA can handle
+ * if I read the code above right. Anyways let's try this.
+ * FIXME: use devm_*
+ */
+ host->bounce_buffer = dma_alloc_coherent(mmc->parent, max_seg_size,
+ &host->bounce_addr, GFP_KERNEL);
+ if (!host->bounce_buffer) {
+ dev_err(mmc->parent,
+ "failed to allocate %u bytes for bounce buffer\n",
+ max_seg_size);
+ return -ENOMEM;
+ }
+ host->bounce_buffer_size = max_seg_size;
+
+ /* Lie about this since we're bouncing */
+ mmc->max_segs = max_blocks;
+ mmc->max_seg_size = max_seg_size;
+
+ dev_info(mmc->parent,
+ "bounce buffer: bounce up to %u segments into one, max segment size %u bytes\n",
+ max_blocks, max_seg_size);
+ }
+
return 0;
unreg:
@@ -3743,6 +3818,9 @@ void sdhci_cleanup_host(struct sdhci_host *host)
host->align_addr);
host->adma_table = NULL;
host->align_buffer = NULL;
+ if (host->bounce_buffer)
+ dma_free_coherent(mmc->parent, host->bounce_buffer_size,
+ host->bounce_buffer, host->bounce_addr);
}
EXPORT_SYMBOL_GPL(sdhci_cleanup_host);
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 54bc444c317f..865e09618d22 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -440,6 +440,9 @@ struct sdhci_host {
int irq; /* Device IRQ */
void __iomem *ioaddr; /* Mapped address */
+ char *bounce_buffer; /* For packing SDMA reads/writes */
+ dma_addr_t bounce_addr;
+ size_t bounce_buffer_size;
const struct sdhci_ops *ops; /* Low level hw interface */
--
2.14.3
This is a note to let you know that I've just added the patch titled
mac80211: Add RX flag to indicate ICV stripped
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mac80211-add-rx-flag-to-indicate-icv-stripped.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From cef0acd4d7d4811d2d19cd0195031bf0dfe41249 Mon Sep 17 00:00:00 2001
From: David Spinadel <david.spinadel(a)intel.com>
Date: Mon, 21 Nov 2016 16:58:40 +0200
Subject: mac80211: Add RX flag to indicate ICV stripped
From: David Spinadel <david.spinadel(a)intel.com>
commit cef0acd4d7d4811d2d19cd0195031bf0dfe41249 upstream.
Add a flag that indicates that the WEP ICV was stripped from an
RX packet, allowing the device not to transfer it if it has already
been checked.
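A hedged sketch of how a driver's receive path would use the new flag
(the mac80211 calls are real; 'hw_stripped_icv' is a hypothetical
per-frame indication from the hardware):

#include <net/mac80211.h>

static void example_deliver_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
			       bool hw_stripped_icv)
{
	struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb);

	status->flag |= RX_FLAG_DECRYPTED;
	if (hw_stripped_icv)
		status->flag |= RX_FLAG_ICV_STRIPPED;

	ieee80211_rx(hw, skb);
}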
Signed-off-by: David Spinadel <david.spinadel(a)intel.com>
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Cc: Kalle Valo <kvalo(a)codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/mac80211.h | 5 ++++-
net/mac80211/wep.c | 3 ++-
net/mac80211/wpa.c | 3 ++-
3 files changed, 8 insertions(+), 3 deletions(-)
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -1007,7 +1007,7 @@ ieee80211_tx_info_clear_status(struct ie
* @RX_FLAG_DECRYPTED: This frame was decrypted in hardware.
* @RX_FLAG_MMIC_STRIPPED: the Michael MIC is stripped off this frame,
* verification has been done by the hardware.
- * @RX_FLAG_IV_STRIPPED: The IV/ICV are stripped from this frame.
+ * @RX_FLAG_IV_STRIPPED: The IV and ICV are stripped from this frame.
* If this flag is set, the stack cannot do any replay detection
* hence the driver or hardware will have to do that.
* @RX_FLAG_PN_VALIDATED: Currently only valid for CCMP/GCMP frames, this
@@ -1078,6 +1078,8 @@ ieee80211_tx_info_clear_status(struct ie
* @RX_FLAG_ALLOW_SAME_PN: Allow the same PN as same packet before.
* This is used for AMSDU subframes which can have the same PN as
* the first subframe.
+ * @RX_FLAG_ICV_STRIPPED: The ICV is stripped from this frame. CRC checking must
+ * be done in the hardware.
*/
enum mac80211_rx_flags {
RX_FLAG_MMIC_ERROR = BIT(0),
@@ -1113,6 +1115,7 @@ enum mac80211_rx_flags {
RX_FLAG_RADIOTAP_VENDOR_DATA = BIT(31),
RX_FLAG_MIC_STRIPPED = BIT_ULL(32),
RX_FLAG_ALLOW_SAME_PN = BIT_ULL(33),
+ RX_FLAG_ICV_STRIPPED = BIT_ULL(34),
};
#define RX_FLAG_STBC_SHIFT 26
--- a/net/mac80211/wep.c
+++ b/net/mac80211/wep.c
@@ -293,7 +293,8 @@ ieee80211_crypto_wep_decrypt(struct ieee
return RX_DROP_UNUSABLE;
ieee80211_wep_remove_iv(rx->local, rx->skb, rx->key);
/* remove ICV */
- if (pskb_trim(rx->skb, rx->skb->len - IEEE80211_WEP_ICV_LEN))
+ if (!(status->flag & RX_FLAG_ICV_STRIPPED) &&
+ pskb_trim(rx->skb, rx->skb->len - IEEE80211_WEP_ICV_LEN))
return RX_DROP_UNUSABLE;
}
--- a/net/mac80211/wpa.c
+++ b/net/mac80211/wpa.c
@@ -295,7 +295,8 @@ ieee80211_crypto_tkip_decrypt(struct iee
return RX_DROP_UNUSABLE;
/* Trim ICV */
- skb_trim(skb, skb->len - IEEE80211_TKIP_ICV_LEN);
+ if (!(status->flag & RX_FLAG_ICV_STRIPPED))
+ skb_trim(skb, skb->len - IEEE80211_TKIP_ICV_LEN);
/* Remove IV */
memmove(skb->data + IEEE80211_TKIP_IV_LEN, skb->data, hdrlen);
Patches currently in stable-queue which might be from david.spinadel(a)intel.com are
queue-4.9/mac80211-add-rx-flag-to-indicate-icv-stripped.patch
This is a note to let you know that I've just added the patch titled
dm bufio: fix shrinker scans when (nr_to_scan < retain_target)
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-bufio-fix-shrinker-scans-when-nr_to_scan-retain_target.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From fbc7c07ec23c040179384a1f16b62b6030eb6bdd Mon Sep 17 00:00:00 2001
From: Suren Baghdasaryan <surenb(a)google.com>
Date: Wed, 6 Dec 2017 09:27:30 -0800
Subject: dm bufio: fix shrinker scans when (nr_to_scan < retain_target)
From: Suren Baghdasaryan <surenb(a)google.com>
commit fbc7c07ec23c040179384a1f16b62b6030eb6bdd upstream.
When the system is under memory pressure, it is observed that the dm
bufio shrinker often reclaims only one buffer per scan. This change
fixes the following two issues in the dm bufio shrinker that cause
this behavior:
1. ((nr_to_scan - freed) <= retain_target) condition is used to
terminate slab scan process. This assumes that nr_to_scan is equal
to the LRU size, which might not be correct because do_shrink_slab()
in vmscan.c calculates nr_to_scan using multiple inputs.
As a result when nr_to_scan is less than retain_target (64) the scan
will terminate after the first iteration, effectively reclaiming one
buffer per scan and making scans very inefficient. This hurts vmscan
performance especially because mutex is acquired/released every time
dm_bufio_shrink_scan() is called.
New implementation uses ((LRU size - freed) <= retain_target)
condition for scan termination. LRU size can be safely determined
inside __scan() because this function is called after dm_bufio_lock().
2. do_shrink_slab() uses value returned by dm_bufio_shrink_count() to
determine number of freeable objects in the slab. However dm_bufio
always retains retain_target buffers in its LRU and will terminate
a scan when this mark is reached. Therefore returning the entire LRU size
from dm_bufio_shrink_count() is misleading because that does not
represent the number of freeable objects that slab will reclaim during
a scan. Returning (LRU size - retain_target) better represents the
number of freeable objects in the slab. This way do_shrink_slab()
returns 0 when (LRU size < retain_target) and vmscan will not try to
scan this shrinker avoiding scans that will not reclaim any memory.
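A distilled sketch of the count-side logic being introduced (not the
dm-bufio code verbatim): report as freeable only what a scan could
actually reclaim, i.e. the LRU size minus the buffers the client
always retains.

/* Sketch: lru_size and retain_target stand in for the client's buffer
 * counts; do_shrink_slab() then sees 0 when nothing is reclaimable. */
static unsigned long example_shrink_count(unsigned long lru_size,
					  unsigned long retain_target)
{
	return (lru_size < retain_target) ? 0 : (lru_size - retain_target);
}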
Test: tested using Android device running
<AOSP>/system/extras/alloc-stress that generates memory pressure
and causes intensive shrinker scans
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-bufio.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1554,7 +1554,8 @@ static unsigned long __scan(struct dm_bu
int l;
struct dm_buffer *b, *tmp;
unsigned long freed = 0;
- unsigned long count = nr_to_scan;
+ unsigned long count = c->n_buffers[LIST_CLEAN] +
+ c->n_buffers[LIST_DIRTY];
unsigned long retain_target = get_retain_buffers(c);
for (l = 0; l < LIST_SIZE; l++) {
@@ -1591,6 +1592,7 @@ dm_bufio_shrink_count(struct shrinker *s
{
struct dm_bufio_client *c;
unsigned long count;
+ unsigned long retain_target;
c = container_of(shrink, struct dm_bufio_client, shrinker);
if (sc->gfp_mask & __GFP_FS)
@@ -1599,8 +1601,9 @@ dm_bufio_shrink_count(struct shrinker *s
return 0;
count = c->n_buffers[LIST_CLEAN] + c->n_buffers[LIST_DIRTY];
+ retain_target = get_retain_buffers(c);
dm_bufio_unlock(c);
- return count;
+ return (count < retain_target) ? 0 : (count - retain_target);
}
/*
Patches currently in stable-queue which might be from surenb(a)google.com are
queue-4.9/dm-bufio-fix-shrinker-scans-when-nr_to_scan-retain_target.patch
This is a note to let you know that I've just added the patch titled
dm bufio: fix shrinker scans when (nr_to_scan < retain_target)
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-bufio-fix-shrinker-scans-when-nr_to_scan-retain_target.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From fbc7c07ec23c040179384a1f16b62b6030eb6bdd Mon Sep 17 00:00:00 2001
From: Suren Baghdasaryan <surenb(a)google.com>
Date: Wed, 6 Dec 2017 09:27:30 -0800
Subject: dm bufio: fix shrinker scans when (nr_to_scan < retain_target)
From: Suren Baghdasaryan <surenb(a)google.com>
commit fbc7c07ec23c040179384a1f16b62b6030eb6bdd upstream.
When the system is under memory pressure, it is observed that the dm
bufio shrinker often reclaims only one buffer per scan. This change
fixes the following two issues in the dm bufio shrinker that cause
this behavior:
1. ((nr_to_scan - freed) <= retain_target) condition is used to
terminate slab scan process. This assumes that nr_to_scan is equal
to the LRU size, which might not be correct because do_shrink_slab()
in vmscan.c calculates nr_to_scan using multiple inputs.
As a result when nr_to_scan is less than retain_target (64) the scan
will terminate after the first iteration, effectively reclaiming one
buffer per scan and making scans very inefficient. This hurts vmscan
performance especially because mutex is acquired/released every time
dm_bufio_shrink_scan() is called.
New implementation uses ((LRU size - freed) <= retain_target)
condition for scan termination. LRU size can be safely determined
inside __scan() because this function is called after dm_bufio_lock().
2. do_shrink_slab() uses value returned by dm_bufio_shrink_count() to
determine number of freeable objects in the slab. However dm_bufio
always retains retain_target buffers in its LRU and will terminate
a scan when this mark is reached. Therefore returning the entire LRU size
from dm_bufio_shrink_count() is misleading because that does not
represent the number of freeable objects that slab will reclaim during
a scan. Returning (LRU size - retain_target) better represents the
number of freeable objects in the slab. This way do_shrink_slab()
returns 0 when (LRU size < retain_target) and vmscan will not try to
scan this shrinker avoiding scans that will not reclaim any memory.
Test: tested using Android device running
<AOSP>/system/extras/alloc-stress that generates memory pressure
and causes intensive shrinker scans
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-bufio.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1527,7 +1527,8 @@ static unsigned long __scan(struct dm_bu
int l;
struct dm_buffer *b, *tmp;
unsigned long freed = 0;
- unsigned long count = nr_to_scan;
+ unsigned long count = c->n_buffers[LIST_CLEAN] +
+ c->n_buffers[LIST_DIRTY];
unsigned long retain_target = get_retain_buffers(c);
for (l = 0; l < LIST_SIZE; l++) {
@@ -1564,6 +1565,7 @@ dm_bufio_shrink_count(struct shrinker *s
{
struct dm_bufio_client *c;
unsigned long count;
+ unsigned long retain_target;
c = container_of(shrink, struct dm_bufio_client, shrinker);
if (sc->gfp_mask & __GFP_FS)
@@ -1572,8 +1574,9 @@ dm_bufio_shrink_count(struct shrinker *s
return 0;
count = c->n_buffers[LIST_CLEAN] + c->n_buffers[LIST_DIRTY];
+ retain_target = get_retain_buffers(c);
dm_bufio_unlock(c);
- return count;
+ return (count < retain_target) ? 0 : (count - retain_target);
}
/*
Patches currently in stable-queue which might be from surenb(a)google.com are
queue-4.4/dm-bufio-fix-shrinker-scans-when-nr_to_scan-retain_target.patch
Hi Linux stable team,
ath10k has a replay detection problem which was fixed in v4.14. I would
like to get the fix into linux-stable-4.9.y as well, but it depends on
a small mac80211 patch. So when cherry-picking the fixes, please take
the mac80211 commit first:
cef0acd4d7d4 mac80211: Add RX flag to indicate ICV stripped
7eccb738fce5 ath10k: rebuild crypto header in rx data frames
I tested them, and in this order the commits apply just fine to
linux-4.9.y.
The ath10k patch is largish, but as it fixes a security issue I hope it
can still be applied to linux-stable. Please let me know if there are
any problems.
This is the commit log describing the problem:
--------------------------------------------------------------------------------
ath10k: rebuild crypto header in rx data frames
Rx data frames notified through HTT_T2H_MSG_TYPE_RX_IND and
HTT_T2H_MSG_TYPE_RX_FRAG_IND expect the PN/TSC check to be done on the
host (mac80211) rather than in firmware. Rebuild the cipher header in
every received data frame (that is notified through those HTT
interfaces) from the rx_hdr_status tlv available in the rx descriptor
of the first msdu. Skip setting the RX_FLAG_IV_STRIPPED flag for the
packets which require mac80211 PN/TSC check support, and set the
appropriate RX_FLAG for a stripped crypto tail. The QCA988X, QCA9887,
QCA99X0, QCA9984, QCA9888 and QCA4019 hardware currently need the
rebuilding of the cipher header to perform the PN/TSC check for replay
attacks.
Please note that removing the crypto tail for the CCMP-256, GCMP and
GCMP-256 ciphers in raw mode still needs to be fixed. Since Rx with
these ciphers in raw mode does not work in its current form even
without this patch, and removing the crypto tail for these ciphers
needs clean-up, the raw-mode-related issues in CCMP-256, GCMP and
GCMP-256 can be addressed in follow-up patches.
Tested-by: Manikanta Pubbisetty <mpubbise(a)qti.qualcomm.com>
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar(a)qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo(a)qca.qualcomm.com>
----------------------------------------------------------------------
--
Kalle Valo
Dear Kernel maintainers,
Could you please add commit id
fbc7c07ec23c040179384a1f16b62b6030eb6bdd from Linus's tree to stable
kernel trees?
Attached are the patches I prepared for stable 4.4, 4.9 and 4.14 branches.
For 4.4 and 4.9:
0001-BACKPORT-dm-bufio-fix-shrinker-scans-when-nr_to_scan-4.4-4.9.patch
For 4.14: 0001-BACKPORT-dm-bufio-fix-shrinker-scans-when-nr_to_scan-4.14.patch
Thanks,
Suren.
This is a note to let you know that I've just added the patch titled
dm bufio: fix shrinker scans when (nr_to_scan < retain_target)
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-bufio-fix-shrinker-scans-when-nr_to_scan-retain_target.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From fbc7c07ec23c040179384a1f16b62b6030eb6bdd Mon Sep 17 00:00:00 2001
From: Suren Baghdasaryan <surenb(a)google.com>
Date: Wed, 6 Dec 2017 09:27:30 -0800
Subject: dm bufio: fix shrinker scans when (nr_to_scan < retain_target)
From: Suren Baghdasaryan <surenb(a)google.com>
commit fbc7c07ec23c040179384a1f16b62b6030eb6bdd upstream.
When the system is under memory pressure, it is observed that the dm
bufio shrinker often reclaims only one buffer per scan. This change
fixes the following two issues in the dm bufio shrinker that cause
this behavior:
1. ((nr_to_scan - freed) <= retain_target) condition is used to
terminate slab scan process. This assumes that nr_to_scan is equal
to the LRU size, which might not be correct because do_shrink_slab()
in vmscan.c calculates nr_to_scan using multiple inputs.
As a result when nr_to_scan is less than retain_target (64) the scan
will terminate after the first iteration, effectively reclaiming one
buffer per scan and making scans very inefficient. This hurts vmscan
performance especially because mutex is acquired/released every time
dm_bufio_shrink_scan() is called.
New implementation uses ((LRU size - freed) <= retain_target)
condition for scan termination. LRU size can be safely determined
inside __scan() because this function is called after dm_bufio_lock().
2. do_shrink_slab() uses value returned by dm_bufio_shrink_count() to
determine number of freeable objects in the slab. However dm_bufio
always retains retain_target buffers in its LRU and will terminate
a scan when this mark is reached. Therefore returning the entire LRU size
from dm_bufio_shrink_count() is misleading because that does not
represent the number of freeable objects that slab will reclaim during
a scan. Returning (LRU size - retain_target) better represents the
number of freeable objects in the slab. This way do_shrink_slab()
returns 0 when (LRU size < retain_target) and vmscan will not try to
scan this shrinker avoiding scans that will not reclaim any memory.
Test: tested using Android device running
<AOSP>/system/extras/alloc-stress that generates memory pressure
and causes intensive shrinker scans
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-bufio.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1611,7 +1611,8 @@ static unsigned long __scan(struct dm_bu
int l;
struct dm_buffer *b, *tmp;
unsigned long freed = 0;
- unsigned long count = nr_to_scan;
+ unsigned long count = c->n_buffers[LIST_CLEAN] +
+ c->n_buffers[LIST_DIRTY];
unsigned long retain_target = get_retain_buffers(c);
for (l = 0; l < LIST_SIZE; l++) {
@@ -1647,8 +1648,11 @@ static unsigned long
dm_bufio_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
{
struct dm_bufio_client *c = container_of(shrink, struct dm_bufio_client, shrinker);
+ unsigned long count = ACCESS_ONCE(c->n_buffers[LIST_CLEAN]) +
+ ACCESS_ONCE(c->n_buffers[LIST_DIRTY]);
+ unsigned long retain_target = get_retain_buffers(c);
- return ACCESS_ONCE(c->n_buffers[LIST_CLEAN]) + ACCESS_ONCE(c->n_buffers[LIST_DIRTY]);
+ return (count < retain_target) ? 0 : (count - retain_target);
}
/*
Patches currently in stable-queue which might be from surenb(a)google.com are
queue-4.14/dm-bufio-fix-shrinker-scans-when-nr_to_scan-retain_target.patch