Normal request completion can happen before or while
BLK_EH_RESET_TIMER is being handled, and this race may cause the
request to never be completed, since the driver's .timeout() may keep
returning BLK_EH_RESET_TIMER.
This issue can't be fixed completely by the driver, since the normal
completion can happen between .timeout() returning and
BLK_EH_RESET_TIMER being handled.
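For illustration, the problematic interleaving looks roughly like
this (a sketch of the race described above, not an actual trace):

    timeout path                         normal completion path
    ------------                         ----------------------
    blk_mq_check_expired():
      set ->aborted_gstate
    .timeout() returns
      BLK_EH_RESET_TIMER
                                         blk_mq_complete_request():
                                           ->aborted_gstate still set,
                                           completion is dropped
    blk_mq_rq_update_aborted_gstate(req, 0)
    blk_add_timer(req)
    (the request times out again, .timeout() may return
    BLK_EH_RESET_TIMER again, and the request is never completed)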
This patch fixes the race by introducing a new request state,
MQ_RQ_COMPLETE_IN_RESET, and by reading/writing the request's state
under the queue lock. The lock could actually be per-request, but it
is not worth introducing a dedicated lock for such an unusual event.
Cc: Bart Van Assche <bart.vanassche(a)wdc.com>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Ming Lei <ming.lei(a)redhat.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Israel Rukshin <israelr(a)mellanox.com>
Cc: Max Gurtovoy <maxg(a)mellanox.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
---
This is another way to fix this long-standing issue, and this
solution turns out to be much simpler.
block/blk-mq.c | 46 +++++++++++++++++++++++++++++++++++++++++++-----
block/blk-mq.h | 1 +
2 files changed, 42 insertions(+), 5 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0dc9e341c2a7..12e8850e3905 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -630,10 +630,27 @@ void blk_mq_complete_request(struct request *rq)
* However, that would complicate paths which want to synchronize
* against us. Let stay in sync with the issue path so that
* hctx_lock() covers both issue and completion paths.
+ *
+ * Cover the completion vs BLK_EH_RESET_TIMER race in the slow
+ * path by holding the queue lock.
*/
hctx_lock(hctx, &srcu_idx);
if (blk_mq_rq_aborted_gstate(rq) != rq->gstate)
__blk_mq_complete_request(rq);
+ else {
+ unsigned long flags;
+ bool need_complete = false;
+
+ spin_lock_irqsave(q->queue_lock, flags);
+ if (!blk_mq_rq_aborted_gstate(rq))
+ need_complete = true;
+ else
+ blk_mq_rq_update_state(rq, MQ_RQ_COMPLETE_IN_RESET);
+ spin_unlock_irqrestore(q->queue_lock, flags);
+
+ if (need_complete)
+ __blk_mq_complete_request(rq);
+ }
hctx_unlock(hctx, srcu_idx);
}
EXPORT_SYMBOL(blk_mq_complete_request);
@@ -814,24 +831,43 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
{
const struct blk_mq_ops *ops = req->q->mq_ops;
enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
+ unsigned long flags;
req->rq_flags |= RQF_MQ_TIMEOUT_EXPIRED;
if (ops->timeout)
ret = ops->timeout(req, reserved);
+again:
switch (ret) {
case BLK_EH_HANDLED:
__blk_mq_complete_request(req);
break;
case BLK_EH_RESET_TIMER:
/*
- * As nothing prevents from completion happening while
- * ->aborted_gstate is set, this may lead to ignored
- * completions and further spurious timeouts.
+ * The normal completion may happen while the timeout is being
+ * handled, or even after .timeout() has returned. Once the
+ * request has been completed, we must not reset the timer,
+ * because the next timeout handling could return
+ * BLK_EH_RESET_TIMER again and the request would never be
+ * completed.
+ *
+ * Hold the queue lock while reading/writing the request's
+ * aborted_gstate and normal state, so the race is avoided
+ * completely.
*/
- blk_mq_rq_update_aborted_gstate(req, 0);
- blk_add_timer(req);
+ spin_lock_irqsave(req->q->queue_lock, flags);
+ if (blk_mq_rq_state(req) != MQ_RQ_COMPLETE_IN_RESET) {
+ blk_mq_rq_update_aborted_gstate(req, 0);
+ blk_add_timer(req);
+ } else {
+ blk_mq_rq_update_state(req, MQ_RQ_IN_FLIGHT);
+ ret = BLK_EH_HANDLED;
+ /* drop the lock before re-entering the switch */
+ spin_unlock_irqrestore(req->q->queue_lock, flags);
+ goto again;
+ }
+ spin_unlock_irqrestore(req->q->queue_lock, flags);
break;
case BLK_EH_NOT_HANDLED:
break;
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 88c558f71819..6dc242fc785a 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -35,6 +35,7 @@ enum mq_rq_state {
MQ_RQ_IDLE = 0,
MQ_RQ_IN_FLIGHT = 1,
MQ_RQ_COMPLETE = 2,
+ MQ_RQ_COMPLETE_IN_RESET = 3,
MQ_RQ_STATE_BITS = 2,
MQ_RQ_STATE_MASK = (1 << MQ_RQ_STATE_BITS) - 1,
--
2.9.5
When an event on a child inode is sent to the parent inode's mark and
the parent inode's mark was not marked with FAN_EVENT_ON_CHILD, the
event will not be delivered to the listener process. However, if the
same process also has a mount mark, the event to the parent inode will
be delivered regardless of the mount mark's mask.
This behavior is incorrect in the case where the mount mark mask does
not contain the specific event type. For example, the process adds
a mark on a directory with mask FAN_MODIFY (without FAN_EVENT_ON_CHILD)
and a mount mark with mask FAN_CLOSE_NOWRITE (without FAN_ONDIR).
A modify event on a file inside that directory (and inside that mount)
should not create a FAN_MODIFY event, because neither of the marks
requested that event on the file.
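For reference, the example above corresponds roughly to the following
fanotify calls from the listener (the paths are hypothetical, error
handling omitted):

	#include <fcntl.h>
	#include <sys/fanotify.h>

	int fd = fanotify_init(FAN_CLASS_NOTIF, O_RDONLY);

	/* inode mark on the directory: FAN_MODIFY, no FAN_EVENT_ON_CHILD */
	fanotify_mark(fd, FAN_MARK_ADD, FAN_MODIFY, AT_FDCWD, "/mnt/dir");

	/* mount mark: FAN_CLOSE_NOWRITE only, no FAN_ONDIR */
	fanotify_mark(fd, FAN_MARK_ADD | FAN_MARK_MOUNT, FAN_CLOSE_NOWRITE,
		      AT_FDCWD, "/mnt");

Before this patch, a modify of a file under /mnt/dir was still reported
as FAN_MODIFY because the mount mark exists; with this patch it is not,
since neither mark asks for FAN_MODIFY on that file.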
Fixes: 1968f5eed54c ("fanotify: use both marks when possible")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
---
fs/notify/fanotify/fanotify.c | 34 +++++++++++++++-------------------
1 file changed, 15 insertions(+), 19 deletions(-)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 6702a6a0bbb5..e0e6a9d627df 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -92,7 +92,7 @@ static bool fanotify_should_send_event(struct fsnotify_mark *inode_mark,
u32 event_mask,
const void *data, int data_type)
{
- __u32 marks_mask, marks_ignored_mask;
+ __u32 marks_mask = 0, marks_ignored_mask = 0;
const struct path *path = data;
pr_debug("%s: inode_mark=%p vfsmnt_mark=%p mask=%x data=%p"
@@ -108,24 +108,20 @@ static bool fanotify_should_send_event(struct fsnotify_mark *inode_mark,
!d_can_lookup(path->dentry))
return false;
- if (inode_mark && vfsmnt_mark) {
- marks_mask = (vfsmnt_mark->mask | inode_mark->mask);
- marks_ignored_mask = (vfsmnt_mark->ignored_mask | inode_mark->ignored_mask);
- } else if (inode_mark) {
- /*
- * if the event is for a child and this inode doesn't care about
- * events on the child, don't send it!
- */
- if ((event_mask & FS_EVENT_ON_CHILD) &&
- !(inode_mark->mask & FS_EVENT_ON_CHILD))
- return false;
- marks_mask = inode_mark->mask;
- marks_ignored_mask = inode_mark->ignored_mask;
- } else if (vfsmnt_mark) {
- marks_mask = vfsmnt_mark->mask;
- marks_ignored_mask = vfsmnt_mark->ignored_mask;
- } else {
- BUG();
+ /*
+ * if the event is for a child and this inode doesn't care about
+ * events on the child, don't send it!
+ */
+ if (inode_mark &&
+ (!(event_mask & FS_EVENT_ON_CHILD) ||
+ (inode_mark->mask & FS_EVENT_ON_CHILD))) {
+ marks_mask |= inode_mark->mask;
+ marks_ignored_mask |= inode_mark->ignored_mask;
+ }
+
+ if (vfsmnt_mark) {
+ marks_mask |= vfsmnt_mark->mask;
+ marks_ignored_mask |= vfsmnt_mark->ignored_mask;
}
if (d_is_dir(path->dentry) &&
--
2.7.4
Stable team, please backport commit
a306343bcd7d ("drm/i915/edp: Do not do link training fallback or prune modes on EDP")
to v4.13+
Fixes: 9301397a63b3 ("drm/i915: Implement Link Rate fallback on Link training failure")
BR,
Jani.
--
Jani Nikula, Intel Open Source Technology Center
cache_reap() is initially scheduled in start_cpu_timer() via
schedule_delayed_work_on(). But then the next iterations are scheduled via
schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND.
Thus, since commit ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND
work on wq_unbound_cpumask CPUs") there is no guarantee that future
iterations will run on the originally intended CPU, although it is
still preferred. I was able to demonstrate this with
/sys/module/workqueue/parameters/debug_force_rr_cpu. IIUC, it may also
happen due to migrating timers in a nohz context. As a result, some
CPUs would call cache_reap() more frequently and others never.
This patch uses schedule_delayed_work_on() with the current CPU when
scheduling the next iteration.
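The resulting self-rearming pattern, as a generic sketch (the function
name and period are made up; to_delayed_work() maps the callback
argument back to its delayed_work):

	#include <linux/smp.h>
	#include <linux/workqueue.h>

	static void my_reap(struct work_struct *w)
	{
		struct delayed_work *dwork = to_delayed_work(w);

		/* ... per-CPU processing ... */

		/*
		 * Re-arm explicitly on the current CPU; plain
		 * schedule_delayed_work() queues on WORK_CPU_UNBOUND
		 * and may run the next iteration on another CPU.
		 */
		schedule_delayed_work_on(smp_processor_id(), dwork, HZ);
	}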
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs")
CC: <stable(a)vger.kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Lai Jiangshan <jiangshanlai(a)gmail.com>
Cc: John Stultz <john.stultz(a)linaro.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Stephen Boyd <sboyd(a)kernel.org>
---
mm/slab.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/slab.c b/mm/slab.c
index 9095c3945425..a76006aae857 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -4074,7 +4074,8 @@ static void cache_reap(struct work_struct *w)
next_reap_node();
out:
/* Set up the next iteration */
- schedule_delayed_work(work, round_jiffies_relative(REAPTIMEOUT_AC));
+ schedule_delayed_work_on(smp_processor_id(), work,
+ round_jiffies_relative(REAPTIMEOUT_AC));
}
void get_slabinfo(struct kmem_cache *cachep, struct slabinfo *sinfo)
--
2.16.3
From: Takashi Iwai <tiwai(a)suse.de>
[ Upstream commit d7f910bfedd863d13ea320030fe98e42d0938ed5 ]
For accessing the snd_timer_user queue indices, we take tu->qlock.
But it's forgotten in a couple of places.
The one in snd_timer_user_params() should be safe without the
spinlock, as the timer is already stopped, but taking the lock is
better for consistency.
The one in poll is just a read-out, so it's not strictly needed, but
it's good to make the result consistent, too.
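For context, the queue indices are updated from the timer callback
paths under the same lock, along these lines (a condensed sketch with
field names from sound/core/timer.c, not the actual code):

	spin_lock(&tu->qlock);
	if (tu->qused < tu->queue_size) {
		tu->queue[tu->qtail++] = *r;	/* append one event */
		tu->qtail %= tu->queue_size;
		tu->qused++;
	}
	spin_unlock(&tu->qlock);

An unlocked reset or read-out of tu->qhead/qtail/qused can therefore
observe a half-updated queue state.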
Tested-by: Alexander Potapenko <glider(a)google.com>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
---
sound/core/timer.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/sound/core/timer.c b/sound/core/timer.c
index e5ddc475dca4..bbde1bcdd985 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -1773,6 +1773,7 @@ static int snd_timer_user_params(struct file *file,
}
}
}
+ spin_lock_irq(&tu->qlock);
tu->qhead = tu->qtail = tu->qused = 0;
if (tu->timeri->flags & SNDRV_TIMER_IFLG_EARLY_EVENT) {
if (tu->tread) {
@@ -1793,6 +1794,7 @@ static int snd_timer_user_params(struct file *file,
}
tu->filter = params.filter;
tu->ticks = params.ticks;
+ spin_unlock_irq(&tu->qlock);
err = 0;
_end:
if (copy_to_user(_params, &params, sizeof(params)))
@@ -2034,10 +2036,12 @@ static unsigned int snd_timer_user_poll(struct file *file, poll_table * wait)
poll_wait(file, &tu->qchange_sleep, wait);
mask = 0;
+ spin_lock_irq(&tu->qlock);
if (tu->qused)
mask |= POLLIN | POLLRDNORM;
if (tu->disconnected)
mask |= POLLERR;
+ spin_unlock_irq(&tu->qlock);
return mask;
}
--
2.15.1
The patch titled
Subject: resource: fix integer overflow at reallocation
has been removed from the -mm tree. Its filename was
resource-fix-integer-overflow-at-reallocation.patch
This patch was dropped because an alternative patch was merged
------------------------------------------------------
From: Takashi Iwai <tiwai(a)suse.de>
Subject: resource: fix integer overflow at reallocation
We've got a bug report indicating a kernel panic at boot on an x86-32
system, and it turned out to be caused by an invalid resource assigned
after PCI resource reallocation. __find_resource() first aligns the
resource start address and resets the end address to start+size-1
accordingly, then checks whether the resource is contained. Here the
end address may overflow the integer, although resource_contains()
still returns true because the function validates only the start and
end addresses. So this ends up returning an invalid resource
(start > end).
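As a hypothetical illustration with a 32-bit resource_size_t (the
numbers are made up):

	start = 0xfffff000;		/* aligned start address */
	end   = start + 0x2000 - 1;	/* wraps around to 0x00000fff */
	/* now end < start, yet resource_contains() accepted the range */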
There was already an attempt to cover such a problem in commit
47ea91b4052d ("Resource: fix wrong resource window calculation"), but
this case was overlooked.
This patch adds a validity check to resource_contains() to verify
that the given resource has a valid range, avoiding the integer
overflow problem.
Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1086739
Link: http://lkml.kernel.org/r/20180408072026.27365-1-tiwai@suse.de
Fixes: 23c570a67448 ("resource: ability to resize an allocated resource")
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Reported-by: Michael Henders <hendersm(a)shaw.ca>
Tested-by: Michael Henders <hendersm(a)shaw.ca>
Reviewed-by: Ram Pai <linuxram(a)us.ibm.com>
Cc: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/ioport.h | 3 +++
1 file changed, 3 insertions(+)
diff -puN include/linux/ioport.h~resource-fix-integer-overflow-at-reallocation include/linux/ioport.h
--- a/include/linux/ioport.h~resource-fix-integer-overflow-at-reallocation
+++ a/include/linux/ioport.h
@@ -212,6 +212,9 @@ static inline bool resource_contains(str
return false;
if (r1->flags & IORESOURCE_UNSET || r2->flags & IORESOURCE_UNSET)
return false;
+ /* sanity check whether it's a valid resource range */
+ if (r2->end < r2->start)
+ return false;
return r1->start <= r2->start && r1->end >= r2->end;
}
_
Patches currently in -mm which might be from tiwai(a)suse.de are
resource-fix-integer-overflow-at-reallocation-v1.patch