According to the official documentation for HFS+ [1], inode timestamps
are supposed to cover the time range from 1904 to 2040 as originally
used in classic MacOS.
The traditional Linux usage is to convert the timestamps into an unsigned
32-bit number based on the Unix epoch and from there to a time_t. On
32-bit systems, that wraps the time from 2038 to 1902, so the last
two years of the valid time range become garbled. On 64-bit systems,
all times before 1970 get turned into timestamps between 2038 and 2106,
which is more convenient but also different from the documented behavior.
Looking at the Darwin sources [2], it seems that MacOS is inconsistent in
yet another way: all timestamps are wrapped around to a 32-bit unsigned
number when written to the disk, but when read back, all numeric values
lower than 2082844800U are assumed to be invalid, so we cannot represent
the times before 1970 or the times after 2040.
While all implementations seem to agree on the interpretation of values
between 1970 and 2038, they often differ on the exact range they support
when reading back values outside of the common range:
MacOS (traditional): 1904-2040
Apple Documentation: 1904-2040
MacOS X source comments: 1970-2040
MacOS X source code: 1970-2038
32-bit Linux: 1902-2038
64-bit Linux: 1970-2106
hfsfuse: 1970-2040
hfsutils (32 bit, old libc) 1902-2038
hfsutils (32 bit, new libc) 1970-2106
hfsutils (64 bit) 1904-2040
hfsplus-utils 1904-2040
hfsexplorer 1904-2040
7-zip 1904-2040
This changes Linux over to mostly the same behavior as described in the
code comment in MacOS X, disallowing all times before 1970 and after
2040, while still allowing times between 2038 and 2040 like most other
implementations do. Most importantly, it means we can have the same
behavior on 32-bit and 64-bit.
Cc: stable(a)vger.kernel.org
Link: [1] https://developer.apple.com/library/archive/technotes/tn/tn1150.html
Link: [2] https://opensource.apple.com/source/hfs/hfs-407.30.1/core/MacOSStubs.c.auto…
Suggested-by: Viacheslav Dubeyko <slava(a)dubeyko.com>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
v2: treat pre-1970 dates as invalid following MacOS X behavior,
reword and expand changelog text
---
fs/hfs/hfs_fs.h | 29 +++++++++++++++++++++++++----
fs/hfsplus/hfsplus_fs.h | 26 +++++++++++++++++++++++---
2 files changed, 48 insertions(+), 7 deletions(-)
diff --git a/fs/hfs/hfs_fs.h b/fs/hfs/hfs_fs.h
index 6d0783e2e276..1af998fb522e 100644
--- a/fs/hfs/hfs_fs.h
+++ b/fs/hfs/hfs_fs.h
@@ -246,14 +246,35 @@ extern void hfs_mark_mdb_dirty(struct super_block *sb);
* mac: unsigned big-endian since 00:00 GMT, Jan. 1, 1904
*
*/
-#define __hfs_u_to_mtime(sec) cpu_to_be32(sec + 2082844800U - sys_tz.tz_minuteswest * 60)
-#define __hfs_m_to_utime(sec) (be32_to_cpu(sec) - 2082844800U + sys_tz.tz_minuteswest * 60)
+static inline time64_t __hfs_m_to_utime(__be32 mt)
+{
+ time64_t ut = (u32)(be32_to_cpu(mt) - 2082844800U);
+
+ /*
+ * Times past 2040-02-06 06:28 are assumed to be invalid,
+ * matching the MacOS behavior.
+ */
+ if (ut > 2082844800U + UINT_MAX)
+ ut = 0;
+
+ return ut + sys_tz.tz_minuteswest * 60;
+}
+static inline __be32 __hfs_u_to_mtime(time64_t ut)
+{
+ ut -= - sys_tz.tz_minuteswest * 60;
+
+ /*
+ * MacOS wraps "invalid" times after 2040 when writing back, so
+ * let's do the same here.
+ */
+ return cpu_to_be32(lower_32_bits(ut + 2082844800U));
+}
#define HFS_I(inode) (container_of(inode, struct hfs_inode_info, vfs_inode))
#define HFS_SB(sb) ((struct hfs_sb_info *)(sb)->s_fs_info)
-#define hfs_m_to_utime(time) (struct timespec){ .tv_sec = __hfs_m_to_utime(time) }
-#define hfs_u_to_mtime(time) __hfs_u_to_mtime((time).tv_sec)
+#define hfs_m_to_utime(time) (struct timespec){ .tv_sec = __hfs_m_to_utime(time) }
+#define hfs_u_to_mtime(time) __hfs_u_to_mtime((time).tv_sec)
#define hfs_mtime() __hfs_u_to_mtime(get_seconds())
static inline const char *hfs_mdb_name(struct super_block *sb)
diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index d9255abafb81..7f0943e540a0 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -530,9 +530,29 @@ int hfsplus_submit_bio(struct super_block *sb, sector_t sector, void *buf,
void **data, int op, int op_flags);
int hfsplus_read_wrapper(struct super_block *sb);
-/* time macros */
-#define __hfsp_mt2ut(t) (be32_to_cpu(t) - 2082844800U)
-#define __hfsp_ut2mt(t) (cpu_to_be32(t + 2082844800U))
+/* time helpers */
+static inline time64_t __hfsp_mt2ut(__be32 mt)
+{
+ time64_t ut = (u32)(be32_to_cpu(mt) - 2082844800U);
+
+ /*
+ * Times past 2040-02-06 06:28 are assumed to be invalid,
+ * matching the MacOS behavior.
+ */
+ if (ut > 2082844800U + UINT_MAX)
+ ut = 0;
+
+ return ut;
+}
+
+static inline __be32 __hfsp_ut2mt(time64_t ut)
+{
+ /*
+ * MacOS wraps "invalid" times after 2040 when writing back, so
+ * let's do the same here.
+ */
+ return cpu_to_be32(lower_32_bits(ut + 2082844800U));
+}
/* compatibility */
#define hfsp_mt2ut(t) (struct timespec){ .tv_sec = __hfsp_mt2ut(t) }
--
2.9.0
Inherit the tracing on/off setting on ring_buffer to next
trace buffer when taking a snapshot.
Taking a snapshot is done by swapping with backup ring buffer
(max_tr_buffer). But since the tracing on/off setting is set
in the ring buffer, when swapping it, tracing on/off setting
can also be changed. This causes a strange result like below;
/sys/kernel/debug/tracing # cat tracing_on
1
/sys/kernel/debug/tracing # echo 0 > tracing_on
/sys/kernel/debug/tracing # echo 1 > snapshot
/sys/kernel/debug/tracing # cat tracing_on
1
/sys/kernel/debug/tracing # echo 1 > snapshot
/sys/kernel/debug/tracing # cat tracing_on
0
We don't touch tracing_on, but snapshot changes tracing_on
setting each time. This must be a bug, because user never know
that each "ring_buffer" stores tracing-enable state and
snapshot is done by swapping ring buffers.
This patch fixes above strange behavior.
Fixes: commit debdd57f5145 ("tracing: Make a snapshot feature available from userspace")
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Hiraku Toyooka <hiraku.toyooka(a)cybertrust.co.jp>
Cc: stable(a)vger.kernel.org
---
include/linux/ring_buffer.h | 1 +
kernel/trace/ring_buffer.c | 12 ++++++++++++
kernel/trace/trace.c | 6 ++++++
3 files changed, 19 insertions(+)
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index b72ebdff0b77..003d09ab308d 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -165,6 +165,7 @@ void ring_buffer_record_enable(struct ring_buffer *buffer);
void ring_buffer_record_off(struct ring_buffer *buffer);
void ring_buffer_record_on(struct ring_buffer *buffer);
int ring_buffer_record_is_on(struct ring_buffer *buffer);
+int ring_buffer_record_is_set_on(struct ring_buffer *buffer);
void ring_buffer_record_disable_cpu(struct ring_buffer *buffer, int cpu);
void ring_buffer_record_enable_cpu(struct ring_buffer *buffer, int cpu);
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 6a46af21765c..4038ed74ab95 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3227,6 +3227,18 @@ int ring_buffer_record_is_on(struct ring_buffer *buffer)
}
/**
+ * ring_buffer_record_is_set_on - return true if the ring buffer is set writable
+ * @buffer: The ring buffer to see if write is set enabled
+ *
+ * Returns true if the ring buffer is set writable by ring_buffer_record_on().
+ * Note that this does NOT mean it is in a writable state.
+ */
+int ring_buffer_record_is_set_on(struct ring_buffer *buffer)
+{
+ return !(atomic_read(&buffer->record_disabled) & RB_BUFFER_OFF);
+}
+
+/**
* ring_buffer_record_disable_cpu - stop all writes into the cpu_buffer
* @buffer: The ring buffer to stop writes to.
* @cpu: The CPU buffer to stop
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 2556d8c097d2..bbd5a94a7ef1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1378,6 +1378,12 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
arch_spin_lock(&tr->max_lock);
+ /* Inherit the recordable setting from trace_buffer */
+ if (ring_buffer_record_is_set_on(tr->trace_buffer.buffer))
+ ring_buffer_record_on(tr->max_buffer.buffer);
+ else
+ ring_buffer_record_off(tr->max_buffer.buffer);
+
swap(tr->trace_buffer.buffer, tr->max_buffer.buffer);
__update_max_tr(tr, tsk, cpu);
The existing code to carve up the sg list expected an sg element-per-page
which can be very incorrect with iommu's remapping multiple memory pages
to fewer bus addresses. To hit this error required a large io payload
(greater than 256k) and a system that maps on a per-page basis. It's
possible that large ios could get by fine if the system condensed the
sgl list into the first 64 elements.
This patch corrects the sg list handling by specifically walking the
sg list element by element and attempting to divide the transfer up
on a per-sg element boundary. While doing so, it still tries to keep
sequences under 256k, but will exceed that rule if a single sg element
is larger than 256k.
Fixes: 48fa362b6c3f ("nvmet-fc: simplify sg list handling")
Cc: <stable(a)vger.kernel.org> # 4.14
Signed-off-by: James Smart <james.smart(a)broadcom.com>
---
drivers/nvme/target/fc.c | 44 +++++++++++++++++++++++++++++++++++---------
1 file changed, 35 insertions(+), 9 deletions(-)
diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 408279cb6f2c..29b4b236afd8 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -58,8 +58,8 @@ struct nvmet_fc_ls_iod {
struct work_struct work;
} __aligned(sizeof(unsigned long long));
+/* desired maximum for a single sequence - if sg list allows it */
#define NVMET_FC_MAX_SEQ_LENGTH (256 * 1024)
-#define NVMET_FC_MAX_XFR_SGENTS (NVMET_FC_MAX_SEQ_LENGTH / PAGE_SIZE)
enum nvmet_fcp_datadir {
NVMET_FCP_NODATA,
@@ -74,6 +74,7 @@ struct nvmet_fc_fcp_iod {
struct nvme_fc_cmd_iu cmdiubuf;
struct nvme_fc_ersp_iu rspiubuf;
dma_addr_t rspdma;
+ struct scatterlist *next_sg;
struct scatterlist *data_sg;
int data_sg_cnt;
u32 offset;
@@ -1025,8 +1026,7 @@ nvmet_fc_register_targetport(struct nvmet_fc_port_info *pinfo,
INIT_LIST_HEAD(&newrec->assoc_list);
kref_init(&newrec->ref);
ida_init(&newrec->assoc_cnt);
- newrec->max_sg_cnt = min_t(u32, NVMET_FC_MAX_XFR_SGENTS,
- template->max_sgl_segments);
+ newrec->max_sg_cnt = template->max_sgl_segments;
ret = nvmet_fc_alloc_ls_iodlist(newrec);
if (ret) {
@@ -1722,6 +1722,7 @@ nvmet_fc_alloc_tgt_pgs(struct nvmet_fc_fcp_iod *fod)
((fod->io_dir == NVMET_FCP_WRITE) ?
DMA_FROM_DEVICE : DMA_TO_DEVICE));
/* note: write from initiator perspective */
+ fod->next_sg = fod->data_sg;
return 0;
@@ -1866,24 +1867,49 @@ nvmet_fc_transfer_fcp_data(struct nvmet_fc_tgtport *tgtport,
struct nvmet_fc_fcp_iod *fod, u8 op)
{
struct nvmefc_tgt_fcp_req *fcpreq = fod->fcpreq;
+ struct scatterlist *sg = fod->next_sg;
unsigned long flags;
- u32 tlen;
+ u32 remaininglen = fod->req.transfer_len - fod->offset;
+ u32 tlen = 0;
int ret;
fcpreq->op = op;
fcpreq->offset = fod->offset;
fcpreq->timeout = NVME_FC_TGTOP_TIMEOUT_SEC;
- tlen = min_t(u32, tgtport->max_sg_cnt * PAGE_SIZE,
- (fod->req.transfer_len - fod->offset));
+ /*
+ * for next sequence:
+ * break at a sg element boundary
+ * attempt to keep sequence length capped at
+ * NVMET_FC_MAX_SEQ_LENGTH but allow sequence to
+ * be longer if a single sg element is larger
+ * than that amount. This is done to avoid creating
+ * a new sg list to use for the tgtport api.
+ */
+ fcpreq->sg = sg;
+ fcpreq->sg_cnt = 0;
+ while (tlen < remaininglen &&
+ fcpreq->sg_cnt < tgtport->max_sg_cnt &&
+ tlen + sg_dma_len(sg) < NVMET_FC_MAX_SEQ_LENGTH) {
+ fcpreq->sg_cnt++;
+ tlen += sg_dma_len(sg);
+ sg = sg_next(sg);
+ }
+ if (tlen < remaininglen && fcpreq->sg_cnt == 0) {
+ fcpreq->sg_cnt++;
+ tlen += min_t(u32, sg_dma_len(sg), remaininglen);
+ sg = sg_next(sg);
+ }
+ if (tlen < remaininglen)
+ fod->next_sg = sg;
+ else
+ fod->next_sg = NULL;
+
fcpreq->transfer_length = tlen;
fcpreq->transferred_length = 0;
fcpreq->fcp_error = 0;
fcpreq->rsplen = 0;
- fcpreq->sg = &fod->data_sg[fod->offset / PAGE_SIZE];
- fcpreq->sg_cnt = DIV_ROUND_UP(tlen, PAGE_SIZE);
-
/*
* If the last READDATA request: check if LLDD supports
* combined xfr with response.
--
2.13.1
I would like to speak with the person that managing photos for your
company?
We provide image editing like – photos cutting out and retouching.
Enhancing your images is just a part of what we can do for your business.
Whether you’re an ecommerce
store or portrait photographer, real estate professional, or an e-Retailer,
we are your personal team
of photo editors that integrate seamlessly with your business.
Our mainly services are:
. Cut out, masking, clipping path, deep etching, transparent background
. Colour correction, black and white, light and shadows etc.
. Dust cleaning, spot cleaning
. Beauty retouching, skin retouching, face retouching, body retouching
. Fashion/Beauty Image Retouching
. Product image Retouching
. Real estate image Retouching
. Wedding & Event Album Design.
. Restoration and repair old images
. Vector Conversion
. Portrait image Retouching
We can provide you editing test on your photos.
Please reply if you are interested.
Thanks,
Roland
hrtimer_cancel() busy-waits for the hrtimer callback to stop,
pretty much like del_timer_sync(). This creates a possible deadlock
scenario where we hold a spinlock before calling hrtimer_cancel()
while in trying to acquire the same spinlock in the callback.
This kind of deadlock is already known and is catchable by lockdep,
like for del_timer_sync(), we can add lockdep annotations. However,
it is still missing for hrtimer_cancel(). (I have a WIP patch to make
it complete for hrtimer_cancel() but it breaks booting.)
And there is such a deadlock scenario in kernel/events/core.c too,
well actually, it is a simpler version: the hrtimer callback waits
for itself to finish on the same CPU! It sounds stupid but it is
not obvious at all, it hides very deeply in the perf event code:
cpu_clock_event_init():
perf_swevent_init_hrtimer():
hwc->hrtimer.function = perf_swevent_hrtimer;
perf_swevent_hrtimer():
__perf_event_overflow():
__perf_event_account_interrupt():
perf_adjust_period():
pmu->stop():
cpu_clock_event_stop():
perf_swevent_cancel():
hrtimer_cancel()
Getting stuck in a timer doesn't sound very scary, however, in this
case, its consequences are a disaster:
perf_event_overflow() which calls __perf_event_overflow() is called
in NMI handler too, so it is racy with hrtimer callback as disabling
IRQ can't possibly disable NMI. This means this hrtimer callback
once interrupted by an NMI handler could deadlock within NMI!
As a further consequence, other IRQ handling is blocked too, notably
the IPI handler, especially when smp_call_function_*() waits for their
callbacks synchronously. This is why we saw so many soft lockup's in
smp_call_function_single() given how widely they are used in kernel.
Ironically, perf event code uses synchronous smp_call_function_single()
heavily too.
The fix is not easy. To minimize the impact, ideally we should just
avoid busy waiting when it is called within the hrtimer callback on
the same CPU, there is no reason to wait for itself to finish anyway.
Probably it doesn't even need to cancel itself either since it will
restart by pmu->start() later. There are two possible fixes here:
1. Modify hrtimer API to detect if a hrtimer callback is running
on the same CPU now. This does not look pretty though.
2. Passing some information from perf_swevent_hrtimer() down to
perf_swevent_cancel().
So I pick the latter approach, it is simple and straightforward.
Note, currently perf_swevent_hrtimer() still races with
perf_event_overflow() in NMI on the same CPU anyway, given there is no
lock around and probably locking does not even help. But it is nothing
new, and the race itself is not bad either, at most we have some
inconsistent updates on the event sample period.
Fixes: abd50713944c ("perf: Reimplement frequency driven sampling")
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Arnaldo Carvalho de Melo <acme(a)kernel.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
---
include/linux/perf_event.h | 3 +++
kernel/events/core.c | 43 +++++++++++++++++++++++++++----------------
2 files changed, 30 insertions(+), 16 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 1fa12887ec02..aab39b8aa720 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -310,6 +310,9 @@ struct pmu {
#define PERF_EF_START 0x01 /* start the counter when adding */
#define PERF_EF_RELOAD 0x02 /* reload the counter when starting */
#define PERF_EF_UPDATE 0x04 /* update the counter when stopping */
+#define PERF_EF_NO_WAIT 0x08 /* do not wait when stopping, for
+ * example, waiting for a timer
+ */
/*
* Adds/Removes a counter to/from the PMU, can be done inside a
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8f0434a9951a..f15832346b35 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3555,7 +3555,8 @@ do { \
static DEFINE_PER_CPU(int, perf_throttled_count);
static DEFINE_PER_CPU(u64, perf_throttled_seq);
-static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count, bool disable)
+static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count,
+ bool disable, bool nowait)
{
struct hw_perf_event *hwc = &event->hw;
s64 period, sample_period;
@@ -3574,8 +3575,13 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count, bo
hwc->sample_period = sample_period;
if (local64_read(&hwc->period_left) > 8*sample_period) {
- if (disable)
- event->pmu->stop(event, PERF_EF_UPDATE);
+ if (disable) {
+ int flags = PERF_EF_UPDATE;
+
+ if (nowait)
+ flags |= PERF_EF_NO_WAIT;
+ event->pmu->stop(event, flags);
+ }
local64_set(&hwc->period_left, 0);
@@ -3645,7 +3651,7 @@ static void perf_adjust_freq_unthr_context(struct perf_event_context *ctx,
* twice.
*/
if (delta > 0)
- perf_adjust_period(event, period, delta, false);
+ perf_adjust_period(event, period, delta, false, false);
event->pmu->start(event, delta > 0 ? PERF_EF_RELOAD : 0);
next:
@@ -7681,7 +7687,8 @@ static void perf_log_itrace_start(struct perf_event *event)
}
static int
-__perf_event_account_interrupt(struct perf_event *event, int throttle)
+__perf_event_account_interrupt(struct perf_event *event, int throttle,
+ bool nowait)
{
struct hw_perf_event *hwc = &event->hw;
int ret = 0;
@@ -7710,7 +7717,8 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
hwc->freq_time_stamp = now;
if (delta > 0 && delta < 2*TICK_NSEC)
- perf_adjust_period(event, delta, hwc->last_period, true);
+ perf_adjust_period(event, delta, hwc->last_period, true,
+ nowait);
}
return ret;
@@ -7718,7 +7726,7 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
int perf_event_account_interrupt(struct perf_event *event)
{
- return __perf_event_account_interrupt(event, 1);
+ return __perf_event_account_interrupt(event, 1, false);
}
/*
@@ -7727,7 +7735,7 @@ int perf_event_account_interrupt(struct perf_event *event)
static int __perf_event_overflow(struct perf_event *event,
int throttle, struct perf_sample_data *data,
- struct pt_regs *regs)
+ struct pt_regs *regs, bool nowait)
{
int events = atomic_read(&event->event_limit);
int ret = 0;
@@ -7739,7 +7747,7 @@ static int __perf_event_overflow(struct perf_event *event,
if (unlikely(!is_sampling_event(event)))
return 0;
- ret = __perf_event_account_interrupt(event, throttle);
+ ret = __perf_event_account_interrupt(event, throttle, nowait);
/*
* XXX event_limit might not quite work as expected on inherited
@@ -7768,7 +7776,7 @@ int perf_event_overflow(struct perf_event *event,
struct perf_sample_data *data,
struct pt_regs *regs)
{
- return __perf_event_overflow(event, 1, data, regs);
+ return __perf_event_overflow(event, 1, data, regs, true);
}
/*
@@ -7831,7 +7839,7 @@ static void perf_swevent_overflow(struct perf_event *event, u64 overflow,
for (; overflow; overflow--) {
if (__perf_event_overflow(event, throttle,
- data, regs)) {
+ data, regs, false)) {
/*
* We inhibit the overflow from happening when
* hwc->interrupts == MAX_INTERRUPTS.
@@ -9110,7 +9118,7 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer)
if (regs && !perf_exclude_event(event, regs)) {
if (!(event->attr.exclude_idle && is_idle_task(current)))
- if (__perf_event_overflow(event, 1, &data, regs))
+ if (__perf_event_overflow(event, 1, &data, regs, true))
ret = HRTIMER_NORESTART;
}
@@ -9141,7 +9149,7 @@ static void perf_swevent_start_hrtimer(struct perf_event *event)
HRTIMER_MODE_REL_PINNED);
}
-static void perf_swevent_cancel_hrtimer(struct perf_event *event)
+static void perf_swevent_cancel_hrtimer(struct perf_event *event, bool sync)
{
struct hw_perf_event *hwc = &event->hw;
@@ -9149,7 +9157,10 @@ static void perf_swevent_cancel_hrtimer(struct perf_event *event)
ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer);
local64_set(&hwc->period_left, ktime_to_ns(remaining));
- hrtimer_cancel(&hwc->hrtimer);
+ if (sync)
+ hrtimer_cancel(&hwc->hrtimer);
+ else
+ hrtimer_try_to_cancel(&hwc->hrtimer);
}
}
@@ -9200,7 +9211,7 @@ static void cpu_clock_event_start(struct perf_event *event, int flags)
static void cpu_clock_event_stop(struct perf_event *event, int flags)
{
- perf_swevent_cancel_hrtimer(event);
+ perf_swevent_cancel_hrtimer(event, flags & PERF_EF_NO_WAIT);
cpu_clock_event_update(event);
}
@@ -9277,7 +9288,7 @@ static void task_clock_event_start(struct perf_event *event, int flags)
static void task_clock_event_stop(struct perf_event *event, int flags)
{
- perf_swevent_cancel_hrtimer(event);
+ perf_swevent_cancel_hrtimer(event, flags & PERF_EF_NO_WAIT);
task_clock_event_update(event, event->ctx->time);
}
--
2.14.4
Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle")
allows the writeback rate to be faster if there is no I/O request on a
bcache device. It works well if there is only one bcache device attached
to the cache set. If there are many bcache devices attached to a cache
set, it may introduce performance regression because multiple faster
writeback threads of the idle bcache devices will compete the btree level
locks with the bcache device who have I/O requests coming.
This patch fixes the above issue by only permitting fast writebac when
all bcache devices attached on the cache set are idle. And if one of the
bcache devices has new I/O request coming, minimized all writeback
throughput immediately and let PI controller __update_writeback_rate()
to decide the upcoming writeback rate for each bcache device.
Also when all bcache devices are idle, limited wrieback rate to a small
number is wast of thoughput, especially when backing devices are slower
non-rotation devices (e.g. SATA SSD). This patch sets a max writeback
rate for each backing device if the whole cache set is idle. A faster
writeback rate in idle time means new I/Os may have more available space
for dirty data, and people may observe a better write performance then.
Please note bcache may change its cache mode in run time, and this patch
still works if the cache mode is switched from writeback mode and there
is still dirty data on cache.
Fixes: Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle")
Cc: stable(a)vger.kernel.org #4.16+
Signed-off-by: Coly Li <colyli(a)suse.de>
Tested-by: Kai Krakow <kai(a)kaishome.de>
Cc: Michael Lyle <mlyle(a)lyle.org>
Cc: Stefan Priebe <s.priebe(a)profihost.ag>
---
Channgelog:
v2, Fix a deadlock reported by Stefan Priebe.
v1, Initial version.
drivers/md/bcache/bcache.h | 11 ++--
drivers/md/bcache/request.c | 51 ++++++++++++++-
drivers/md/bcache/super.c | 1 +
drivers/md/bcache/sysfs.c | 14 +++--
drivers/md/bcache/util.c | 2 +-
drivers/md/bcache/util.h | 2 +-
drivers/md/bcache/writeback.c | 115 ++++++++++++++++++++++++++--------
7 files changed, 155 insertions(+), 41 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index d6bf294f3907..469ab1a955e0 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -328,13 +328,6 @@ struct cached_dev {
*/
atomic_t has_dirty;
- /*
- * Set to zero by things that touch the backing volume-- except
- * writeback. Incremented by writeback. Used to determine when to
- * accelerate idle writeback.
- */
- atomic_t backing_idle;
-
struct bch_ratelimit writeback_rate;
struct delayed_work writeback_rate_update;
@@ -514,6 +507,8 @@ struct cache_set {
struct cache_accounting accounting;
unsigned long flags;
+ atomic_t idle_counter;
+ atomic_t at_max_writeback_rate;
struct cache_sb sb;
@@ -523,6 +518,8 @@ struct cache_set {
struct bcache_device **devices;
unsigned devices_max_used;
+ /* See set_at_max_writeback_rate() for how it is used */
+ unsigned previous_dirty_dc_nr;
struct list_head cached_devs;
uint64_t cached_dev_sectors;
struct closure caching;
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index ae67f5fa8047..1af3d96abfa5 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -1104,6 +1104,43 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio)
/* Cached devices - read & write stuff */
+static void quit_max_writeback_rate(struct cache_set *c,
+ struct cached_dev *this_dc)
+{
+ int i;
+ struct bcache_device *d;
+ struct cached_dev *dc;
+
+ /*
+ * If bch_register_lock is acquired by other attach/detach operations,
+ * waiting here will increase I/O request latency for seconds or more.
+ * To avoid such situation, only writeback rate of current cached device
+ * is set to 1, and __update_write_back() will decide writeback rate
+ * of other cached devices (remember c->idle_counter is 0 now).
+ */
+ if (mutex_trylock(&bch_register_lock)){
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ /*
+ * set writeback rate to default minimum value,
+ * then let update_writeback_rate() to decide the
+ * upcoming rate.
+ */
+ atomic64_set(&dc->writeback_rate.rate, 1);
+ }
+
+ mutex_unlock(&bch_register_lock);
+ } else
+ atomic64_set(&this_dc->writeback_rate.rate, 1);
+}
+
static blk_qc_t cached_dev_make_request(struct request_queue *q,
struct bio *bio)
{
@@ -1119,7 +1156,19 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
return BLK_QC_T_NONE;
}
- atomic_set(&dc->backing_idle, 0);
+ if (d->c) {
+ atomic_set(&d->c->idle_counter, 0);
+ /*
+ * If at_max_writeback_rate of cache set is true and new I/O
+ * comes, quit max writeback rate of all cached devices
+ * attached to this cache set, and set at_max_writeback_rate
+ * to false.
+ */
+ if (unlikely(atomic_read(&d->c->at_max_writeback_rate) == 1)) {
+ atomic_set(&d->c->at_max_writeback_rate, 0);
+ quit_max_writeback_rate(d->c, dc);
+ }
+ }
generic_start_io_acct(q, rw, bio_sectors(bio), &d->disk->part0);
bio_set_dev(bio, dc->bdev);
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index fa4058e43202..fa532d9f9353 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1687,6 +1687,7 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
c->block_bits = ilog2(sb->block_size);
c->nr_uuids = bucket_bytes(c) / sizeof(struct uuid_entry);
c->devices_max_used = 0;
+ c->previous_dirty_dc_nr = 0;
c->btree_pages = bucket_pages(c);
if (c->btree_pages > BTREE_MAX_PAGES)
c->btree_pages = max_t(int, c->btree_pages / 4,
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 225b15aa0340..d719021bff81 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -170,7 +170,8 @@ SHOW(__bch_cached_dev)
var_printf(writeback_running, "%i");
var_print(writeback_delay);
var_print(writeback_percent);
- sysfs_hprint(writeback_rate, dc->writeback_rate.rate << 9);
+ sysfs_hprint(writeback_rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
sysfs_hprint(io_errors, atomic_read(&dc->io_errors));
sysfs_printf(io_error_limit, "%i", dc->error_limit);
sysfs_printf(io_disable, "%i", dc->io_disable);
@@ -188,7 +189,8 @@ SHOW(__bch_cached_dev)
char change[20];
s64 next_io;
- bch_hprint(rate, dc->writeback_rate.rate << 9);
+ bch_hprint(rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
bch_hprint(dirty, bcache_dev_sectors_dirty(&dc->disk) << 9);
bch_hprint(target, dc->writeback_rate_target << 9);
bch_hprint(proportional,dc->writeback_rate_proportional << 9);
@@ -255,8 +257,12 @@ STORE(__cached_dev)
sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent, 0, 40);
- sysfs_strtoul_clamp(writeback_rate,
- dc->writeback_rate.rate, 1, INT_MAX);
+ if (attr == &sysfs_writeback_rate) {
+ int v;
+
+ sysfs_strtoul_clamp(writeback_rate, v, 1, INT_MAX);
+ atomic64_set(&dc->writeback_rate.rate, v);
+ }
sysfs_strtoul_clamp(writeback_rate_update_seconds,
dc->writeback_rate_update_seconds,
diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index fc479b026d6d..84f90c3d996d 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -200,7 +200,7 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
{
uint64_t now = local_clock();
- d->next += div_u64(done * NSEC_PER_SEC, d->rate);
+ d->next += div_u64(done * NSEC_PER_SEC, atomic64_read(&d->rate));
/* Bound the time. Don't let us fall further than 2 seconds behind
* (this prevents unnecessary backlog that would make it impossible
diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h
index cced87f8eb27..7e17f32ab563 100644
--- a/drivers/md/bcache/util.h
+++ b/drivers/md/bcache/util.h
@@ -442,7 +442,7 @@ struct bch_ratelimit {
* Rate at which we want to do work, in units per second
* The units here correspond to the units passed to bch_next_delay()
*/
- uint32_t rate;
+ atomic64_t rate;
};
static inline void bch_ratelimit_reset(struct bch_ratelimit *d)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index ad45ebe1a74b..11ffadc3cf8f 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -49,6 +49,80 @@ static uint64_t __calc_target_rate(struct cached_dev *dc)
return (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;
}
+static bool set_at_max_writeback_rate(struct cache_set *c,
+ struct cached_dev *dc)
+{
+ int i, dirty_dc_nr = 0;
+ struct bcache_device *d;
+
+ /*
+ * bch_register_lock is acquired in cached_dev_detach_finish() before
+ * calling cancel_writeback_rate_update_dwork() to stop the delayed
+ * kworker writeback_rate_update (where the context we are for now).
+ * Therefore call mutex_lock() here may introduce deadlock when shut
+ * down the bcache device.
+ * c->previous_dirty_dc_nr is used to record previous calculated
+ * dirty_dc_nr when mutex_trylock() last time succeeded. Then if
+ * mutex_trylock() failed here, use c->previous_dirty_dc_nr as dirty
+ * cached device number. Of cause it might be inaccurate, but a few more
+ * or less loop before setting c->at_max_writeback_rate is much better
+ * then a deadlock here.
+ */
+ if (mutex_trylock(&bch_register_lock)) {
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ if (atomic_read(&dc->has_dirty))
+ dirty_dc_nr++;
+ }
+ c->previous_dirty_dc_nr = dirty_dc_nr;
+
+ mutex_unlock(&bch_register_lock);
+ } else
+ dirty_dc_nr = c->previous_dirty_dc_nr;
+
+ /*
+ * Idle_counter is increased everytime when update_writeback_rate()
+ * is rescheduled in. If all backing devices attached to the same
+ * cache set has same dc->writeback_rate_update_seconds value, it
+ * is about 10 rounds of update_writeback_rate() is called on each
+ * backing device, then the code will fall through at set 1 to
+ * c->at_max_writeback_rate, and a max wrteback rate to each
+ * dc->writeback_rate.rate. This is not very accurate but works well
+ * to make sure the whole cache set has no new I/O coming before
+ * writeback rate is set to a max number.
+ */
+ if (atomic_inc_return(&c->idle_counter) < dirty_dc_nr * 10)
+ return false;
+
+ if (atomic_read(&c->at_max_writeback_rate) != 1)
+ atomic_set(&c->at_max_writeback_rate, 1);
+
+
+ atomic64_set(&dc->writeback_rate.rate, INT_MAX);
+
+ /* keep writeback_rate_target as existing value */
+ dc->writeback_rate_proportional = 0;
+ dc->writeback_rate_integral_scaled = 0;
+ dc->writeback_rate_change = 0;
+
+ /*
+ * Check c->idle_counter and c->at_max_writeback_rate agagain in case
+ * new I/O arrives during before set_at_max_writeback_rate() returns.
+ * Then the writeback rate is set to 1, and its new value should be
+ * decided via __update_writeback_rate().
+ */
+ if (atomic_read(&c->idle_counter) < dirty_dc_nr * 10 ||
+ !atomic_read(&c->at_max_writeback_rate))
+ return false;
+
+ return true;
+}
+
static void __update_writeback_rate(struct cached_dev *dc)
{
/*
@@ -104,8 +178,9 @@ static void __update_writeback_rate(struct cached_dev *dc)
dc->writeback_rate_proportional = proportional_scaled;
dc->writeback_rate_integral_scaled = integral_scaled;
- dc->writeback_rate_change = new_rate - dc->writeback_rate.rate;
- dc->writeback_rate.rate = new_rate;
+ dc->writeback_rate_change = new_rate -
+ atomic64_read(&dc->writeback_rate.rate);
+ atomic64_set(&dc->writeback_rate.rate, new_rate);
dc->writeback_rate_target = target;
}
@@ -138,9 +213,16 @@ static void update_writeback_rate(struct work_struct *work)
down_read(&dc->writeback_lock);
- if (atomic_read(&dc->has_dirty) &&
- dc->writeback_percent)
- __update_writeback_rate(dc);
+ if (atomic_read(&dc->has_dirty) && dc->writeback_percent) {
+ /*
+ * If the whole cache set is idle, set_at_max_writeback_rate()
+ * will set writeback rate to a max number. Then it is
+ * unncessary to update writeback rate for an idle cache set
+ * in maximum writeback rate number(s).
+ */
+ if (!set_at_max_writeback_rate(c, dc))
+ __update_writeback_rate(dc);
+ }
up_read(&dc->writeback_lock);
@@ -422,27 +504,6 @@ static void read_dirty(struct cached_dev *dc)
delay = writeback_delay(dc, size);
- /* If the control system would wait for at least half a
- * second, and there's been no reqs hitting the backing disk
- * for awhile: use an alternate mode where we have at most
- * one contiguous set of writebacks in flight at a time. If
- * someone wants to do IO it will be quick, as it will only
- * have to contend with one operation in flight, and we'll
- * be round-tripping data to the backing disk as quickly as
- * it can accept it.
- */
- if (delay >= HZ / 2) {
- /* 3 means at least 1.5 seconds, up to 7.5 if we
- * have slowed way down.
- */
- if (atomic_inc_return(&dc->backing_idle) >= 3) {
- /* Wait for current I/Os to finish */
- closure_sync(&cl);
- /* And immediately launch a new set. */
- delay = 0;
- }
- }
-
while (!kthread_should_stop() &&
!test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
delay) {
@@ -715,7 +776,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
dc->writeback_running = true;
dc->writeback_percent = 10;
dc->writeback_delay = 30;
- dc->writeback_rate.rate = 1024;
+ atomic64_set(&dc->writeback_rate.rate, 1024);
dc->writeback_rate_minimum = 8;
dc->writeback_rate_update_seconds = WRITEBACK_RATE_UPDATE_SECS_DEFAULT;
--
2.17.1
Changes since v5 [1]:
* Move put_page() before memory_failure() in madvise_inject_error()
(Naoya)
* The previous change uncovered a latent bug / broken assumption in
__put_devmap_managed_page(). We need to preserve page->mapping for
dax pages when they go idle.
* Rename mapping_size() to dev_pagemap_mapping_size() (Naoya)
* Catch and fail attempts to soft-offline dax pages (Naoya)
* Collect Naoya's ack on "mm, memory_failure: Collect mapping size in
collect_procs()"
[1]: https://lists.01.org/pipermail/linux-nvdimm/2018-July/016682.html
---
As it stands, memory_failure() gets thoroughly confused by dev_pagemap
backed mappings. The recovery code has specific enabling for several
possible page states and needs new enabling to handle poison in dax
mappings.
In order to support reliable reverse mapping of user space addresses:
1/ Add new locking in the memory_failure() rmap path to prevent races
that would typically be handled by the page lock.
2/ Since dev_pagemap pages are hidden from the page allocator and the
"compound page" accounting machinery, add a mechanism to determine the
size of the mapping that encompasses a given poisoned pfn.
3/ Given pmem errors can be repaired, change the speculatively accessed
poison protection, mce_unmap_kpfn(), to be reversible and otherwise
allow ongoing access from the kernel.
A side effect of this enabling is that MADV_HWPOISON becomes usable for
dax mappings, however the primary motivation is to allow the system to
survive userspace consumption of hardware-poison via dax. Specifically
the current behavior is:
mce: Uncorrected hardware memory error in user-access at af34214200
{1}[Hardware Error]: It has been corrected by h/w and requires no further action
mce: [Hardware Error]: Machine check events logged
{1}[Hardware Error]: event severity: corrected
Memory failure: 0xaf34214: reserved kernel page still referenced by 1 users
[..]
Memory failure: 0xaf34214: recovery action for reserved kernel page: Failed
mce: Memory error not recovered
<reboot>
...and with these changes:
Injecting memory failure for pfn 0x20cb00 at process virtual address 0x7f763dd00000
Memory failure: 0x20cb00: Killing dax-pmd:5421 due to hardware memory corruption
Memory failure: 0x20cb00: recovery action for dax page: Recovered
Given all the cross dependencies I propose taking this through
nvdimm.git with acks from Naoya, x86/core, x86/RAS, and of course dax
folks.
---
Dan Williams (13):
device-dax: Convert to vmf_insert_mixed and vm_fault_t
device-dax: Enable page_mapping()
device-dax: Set page->index
filesystem-dax: Set page->index
mm, madvise_inject_error: Disable MADV_SOFT_OFFLINE for ZONE_DEVICE pages
mm, dev_pagemap: Do not clear ->mapping on final put
mm, madvise_inject_error: Let memory_failure() optionally take a page reference
mm, memory_failure: Collect mapping size in collect_procs()
filesystem-dax: Introduce dax_lock_mapping_entry()
mm, memory_failure: Teach memory_failure() about dev_pagemap pages
x86/mm/pat: Prepare {reserve,free}_memtype() for "decoy" addresses
x86/memory_failure: Introduce {set,clear}_mce_nospec()
libnvdimm, pmem: Restore page attributes when clearing errors
arch/x86/include/asm/set_memory.h | 42 ++++++
arch/x86/kernel/cpu/mcheck/mce-internal.h | 15 --
arch/x86/kernel/cpu/mcheck/mce.c | 38 -----
arch/x86/mm/pat.c | 16 ++
drivers/dax/device.c | 75 +++++++---
drivers/nvdimm/pmem.c | 26 ++++
drivers/nvdimm/pmem.h | 13 ++
fs/dax.c | 125 ++++++++++++++++-
include/linux/dax.h | 13 ++
include/linux/huge_mm.h | 5 -
include/linux/mm.h | 1
include/linux/set_memory.h | 14 ++
kernel/memremap.c | 1
mm/hmm.c | 2
mm/huge_memory.c | 4 -
mm/madvise.c | 16 ++
mm/memory-failure.c | 210 +++++++++++++++++++++++------
17 files changed, 481 insertions(+), 135 deletions(-)
error_entry and error_exit communicate the user vs kernel status of
the frame using %ebx. This is unnecessary -- the information is in
regs->cs. Just use regs->cs.
This makes error_entry simpler and makes error_exit more robust.
It also fixes a nasty bug. Before all the Spectre nonsense, The
xen_failsafe_callback entry point returned like this:
ALLOC_PT_GPREGS_ON_STACK
SAVE_C_REGS
SAVE_EXTRA_REGS
ENCODE_FRAME_POINTER
jmp error_exit
And it did not go through error_entry. This was bogus: RBX
contained garbage, and error_exit expected a flag in RBX.
Fortunately, it generally contained *nonzero* garbage, so the
correct code path was used. As part of the Spectre fixes, code was
added to clear RBX to mitigate certain speculation attacks. Now,
depending on kernel configuration, RBX got zeroed and, when running
some Wine workloads, the kernel crashes. This was introduced by:
commit 3ac6d8c787b8 ("x86/entry/64: Clear registers for
exceptions/interrupts, to reduce speculation attack surface")
With this patch applied, RBX is no longer needed as a flag, and the
problem goes away.
I suspect that malicious userspace could use this bug to crash the
kernel even without the offending patch applied, though.
[Historical note: I wrote this patch as a cleanup before I was aware
of the bug it fixed.]
[Note to stable maintainers: this should probably get applied to all
kernels. If you're nervous about that, a more conservative fix to
add xorl %ebx,%ebx; incl %ebx before the jump to error_exit should
also fix the problem.]
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dominik Brodowski <linux(a)dominikbrodowski.net>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: xen-devel(a)lists.xenproject.org
Cc: x86(a)kernel.org
Cc: stable(a)vger.kernel.org
Fixes: 3ac6d8c787b8 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
Reported-and-tested-by: "M. Vefa Bicakci" <m.v.b(a)runbox.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
---
I could also submit the conservative fix tagged for -stable and respin
this on top of it. Ingo, Greg, what do you prefer?
arch/x86/entry/entry_64.S | 18 ++++--------------
1 file changed, 4 insertions(+), 14 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 73a522d53b53..8ae7ffda8f98 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -981,7 +981,7 @@ ENTRY(\sym)
call \do_sym
- jmp error_exit /* %ebx: no swapgs flag */
+ jmp error_exit
.endif
END(\sym)
.endm
@@ -1222,7 +1222,6 @@ END(paranoid_exit)
/*
* Save all registers in pt_regs, and switch GS if needed.
- * Return: EBX=0: came from user mode; EBX=1: otherwise
*/
ENTRY(error_entry)
UNWIND_HINT_FUNC
@@ -1269,7 +1268,6 @@ ENTRY(error_entry)
* for these here too.
*/
.Lerror_kernelspace:
- incl %ebx
leaq native_irq_return_iret(%rip), %rcx
cmpq %rcx, RIP+8(%rsp)
je .Lerror_bad_iret
@@ -1303,28 +1301,20 @@ ENTRY(error_entry)
/*
* Pretend that the exception came from user mode: set up pt_regs
- * as if we faulted immediately after IRET and clear EBX so that
- * error_exit knows that we will be returning to user mode.
+ * as if we faulted immediately after IRET.
*/
mov %rsp, %rdi
call fixup_bad_iret
mov %rax, %rsp
- decl %ebx
jmp .Lerror_entry_from_usermode_after_swapgs
END(error_entry)
-
-/*
- * On entry, EBX is a "return to kernel mode" flag:
- * 1: already in kernel mode, don't need SWAPGS
- * 0: user gsbase is loaded, we need SWAPGS and standard preparation for return to usermode
- */
ENTRY(error_exit)
UNWIND_HINT_REGS
DISABLE_INTERRUPTS(CLBR_ANY)
TRACE_IRQS_OFF
- testl %ebx, %ebx
- jnz retint_kernel
+ testb $3, CS(%rsp)
+ jz retint_kernel
jmp retint_user
END(error_exit)
--
2.17.1
The patch titled
Subject: mm: memcg: fix use after free in mem_cgroup_iter()
has been removed from the -mm tree. Its filename was
mm-memcg-fix-use-after-free-in-mem_cgroup_iter.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Jing Xia <jing.xia.mail(a)gmail.com>
Subject: mm: memcg: fix use after free in mem_cgroup_iter()
It was reported that a kernel crash happened in mem_cgroup_iter(), which
can be triggered if the legacy cgroup-v1 non-hierarchical mode is used.
Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
......
Call trace:
mem_cgroup_iter+0x2e0/0x6d4
shrink_zone+0x8c/0x324
balance_pgdat+0x450/0x640
kswapd+0x130/0x4b8
kthread+0xe8/0xfc
ret_from_fork+0x10/0x20
mem_cgroup_iter():
......
if (css_tryget(css)) <-- crash here
break;
......
The crashing reason is that mem_cgroup_iter() uses the memcg object whose
pointer is stored in iter->position, which has been freed before and
filled with POISON_FREE(0x6b).
And the root cause of the use-after-free issue is that
invalidate_reclaim_iterators() fails to reset the value of iter->position
to NULL when the css of the memcg is released in non- hierarchical mode.
Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.…
Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
Signed-off-by: Jing Xia <jing.xia.mail(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: <chunyan.zhang(a)unisoc.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter mm/memcontrol.c
--- a/mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter
+++ a/mm/memcontrol.c
@@ -850,7 +850,7 @@ static void invalidate_reclaim_iterators
int nid;
int i;
- while ((memcg = parent_mem_cgroup(memcg))) {
+ for (; memcg; memcg = parent_mem_cgroup(memcg)) {
for_each_node(nid) {
mz = mem_cgroup_nodeinfo(memcg, nid);
for (i = 0; i <= DEF_PRIORITY; i++) {
_
Patches currently in -mm which might be from jing.xia.mail(a)gmail.com are
The patch titled
Subject: mm/huge_memory.c: fix data loss when splitting a file pmd
has been removed from the -mm tree. Its filename was
thp-fix-data-loss-when-splitting-a-file-pmd.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/huge_memory.c: fix data loss when splitting a file pmd
__split_huge_pmd_locked() must check if the cleared huge pmd was dirty,
and propagate that to PageDirty: otherwise, data may be lost when a huge
tmpfs page is modified then split then reclaimed.
How has this taken so long to be noticed? Because there was no problem
when the huge page is written by a write system call (shmem_write_end()
calls set_page_dirty()), nor when the page is allocated for a write fault
(fault_dirty_shared_page() calls set_page_dirty()); but when allocated for
a read fault (which MAP_POPULATE simulates), no set_page_dirty().
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1807111741430.1106@eggly.anvils
Fixes: d21b9e57c74c ("thp: handle file pages in split_huge_pmd()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Ashwin Chaugule <ashwinch(a)google.com>
Reviewed-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 2 ++
1 file changed, 2 insertions(+)
diff -puN mm/huge_memory.c~thp-fix-data-loss-when-splitting-a-file-pmd mm/huge_memory.c
--- a/mm/huge_memory.c~thp-fix-data-loss-when-splitting-a-file-pmd
+++ a/mm/huge_memory.c
@@ -2084,6 +2084,8 @@ static void __split_huge_pmd_locked(stru
if (vma_is_dax(vma))
return;
page = pmd_page(_pmd);
+ if (!PageDirty(page) && pmd_dirty(_pmd))
+ set_page_dirty(page);
if (!PageReferenced(page) && pmd_young(_pmd))
SetPageReferenced(page);
page_remove_rmap(page, true);
_
Patches currently in -mm which might be from hughd(a)google.com are
Linux expects that if a CPU modifies a memory location, then that
modification will eventually become visible to other CPUs in the system.
On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
prioritised over stores so it is possible for a store operation to be
postponed if a polling loop immediately follows it. If the variable
being polled indirectly depends on the outstanding store [for example,
another CPU may be polling the variable that is pending modification]
then there is the potential for deadlock if interrupts are disabled.
This deadlock occurs in qspinlock code.
This patch changes the definition of cpu_relax() to smp_mb() for
Loongson-3, forcing a flushing of the SFB on SMP systems before the
next load takes place. If the Kernel is not compiled for SMP support,
this will expand to a barrier() as before.
References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore)
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
arch/mips/include/asm/processor.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index af34afb..a8c4a3a 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
#define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
#define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
+#ifdef CONFIG_CPU_LOONGSON3
+/*
+ * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
+ * loop. Since spin loops of any kind should have a cpu_relax() in them, force
+ * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes will
+ * become available as expected.
+ */
+#define cpu_relax() smp_mb()
+#else
#define cpu_relax() barrier()
+#endif
/*
* Return_address is a replacement for __builtin_return_address(count)
--
2.7.0
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e5d54f1935722f83df7619f3978f774c2b802cd8 Mon Sep 17 00:00:00 2001
From: Lyude Paul <lyude(a)redhat.com>
Date: Thu, 12 Jul 2018 13:02:53 -0400
Subject: [PATCH] drm/nouveau/drm/nouveau: Fix runtime PM leak in
nv50_disp_atomic_commit()
A CRTC being enabled doesn't mean it's on! It doesn't even necessarily
mean it's being used. This fixes runtime PM leaks on the P50 I've got
next to me.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs(a)redhat.com>
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index 9382e99a0bc7..31b12b4f321a 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -1878,7 +1878,7 @@ nv50_disp_atomic_commit(struct drm_device *dev,
nv50_disp_atomic_commit_tail(state);
drm_for_each_crtc(crtc, dev) {
- if (crtc->state->enable) {
+ if (crtc->state->active) {
if (!drm->have_disp_power_ref) {
drm->have_disp_power_ref = true;
return 0;
Hi Greg,
Please consider this patchset, which include block/scsi multiqueue performance
enhancement and bugfix.
We've run multiple benchmark and different tests for over one week, looks
good.
These patches are also included in Oracle UEK5.
They're almost just simple cherry-pick, only 2 patches need minor adjust.
They can apply cleanly on 4.14.57.
Jens Axboe (3):
Revert "blk-mq: don't handle TAG_SHARED in restart"
blk-mq: fix issue with shared tag queue re-running
blk-mq: only run the hardware queue if IO is pending
Jianchao Wang (1):
blk-mq: put the driver tag of nxt rq before first one is requeued
Ming Lei (19):
blk-mq-sched: move actual dispatching into one helper
blk-mq: introduce .get_budget and .put_budget in blk_mq_ops
sbitmap: introduce __sbitmap_for_each_set()
blk-mq-sched: improve dispatching from sw queue
scsi: allow passing in null rq to scsi_prep_state_check()
scsi: implement .get_budget and .put_budget for blk-mq
SCSI: don't get target/host busy_count in scsi_mq_get_budget()
blk-mq: don't handle TAG_SHARED in restart
blk-mq: don't restart queue when .get_budget returns BLK_STS_RESOURCE
blk-mq: don't handle failure in .get_budget
blk-flush: don't run queue for requests bypassing flush
block: pass 'run_queue' to blk_mq_request_bypass_insert
blk-flush: use blk_mq_request_bypass_insert()
blk-mq-sched: decide how to handle flush rq via RQF_FLUSH_SEQ
blk-mq: move blk_mq_put_driver_tag*() into blk-mq.h
blk-mq: don't allocate driver tag upfront for flush rq
blk-mq: put driver tag if dispatch budget can't be got
blk-mq: quiesce queue during switching io sched and updating
nr_requests
scsi: core: run queue if SCSI device queue isn't ready and queue is
idle
block/blk-core.c | 2 +-
block/blk-flush.c | 37 +++++--
block/blk-mq-debugfs.c | 1 -
block/blk-mq-sched.c | 203 ++++++++++++++++++++++-------------
block/blk-mq.c | 278 +++++++++++++++++++++++++++---------------------
block/blk-mq.h | 58 +++++++++-
block/elevator.c | 2 +
drivers/scsi/scsi_lib.c | 53 ++++++---
include/linux/blk-mq.h | 20 +++-
include/linux/sbitmap.h | 64 ++++++++---
10 files changed, 475 insertions(+), 243 deletions(-)
--
2.7.4
Function atomic_inc_unless_negative() returns a bool to indicate
success/failure. However cxl_adapter_context_get() wrongly compares
the return value against '>=0' which will always be true. The patch
fixes this comparison to '==0' there by also fixing this compile time
warning:
drivers/misc/cxl/main.c:290 cxl_adapter_context_get()
warn: 'atomic_inc_unless_negative(&adapter->contexts_num)' is unsigned
Cc: stable(a)vger.kernel.org
Fixes: 70b565bbdb91 ("cxl: Prevent adapter reset if an active context exists")
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Vaibhav Jain <vaibhav(a)linux.ibm.com>
---
drivers/misc/cxl/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c
index c1ba0d42cbc8..e0f29b8a872d 100644
--- a/drivers/misc/cxl/main.c
+++ b/drivers/misc/cxl/main.c
@@ -287,7 +287,7 @@ int cxl_adapter_context_get(struct cxl *adapter)
int rc;
rc = atomic_inc_unless_negative(&adapter->contexts_num);
- return rc >= 0 ? 0 : -EBUSY;
+ return rc ? 0 : -EBUSY;
}
void cxl_adapter_context_put(struct cxl *adapter)
--
2.17.1
Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle")
allows the writeback rate to be faster if there is no I/O request on a
bcache device. It works well if there is only one bcache device attached
to the cache set. If there are many bcache devices attached to a cache
set, it may introduce performance regression because multiple faster
writeback threads of the idle bcache devices will compete the btree level
locks with the bcache device who have I/O requests coming.
This patch fixes the above issue by only permitting fast writebac when
all bcache devices attached on the cache set are idle. And if one of the
bcache devices has new I/O request coming, minimized all writeback
throughput immediately and let PI controller __update_writeback_rate()
to decide the upcoming writeback rate for each bcache device.
Also when all bcache devices are idle, limited wrieback rate to a small
number is wast of thoughput, especially when backing devices are slower
non-rotation devices (e.g. SATA SSD). This patch sets a max writeback
rate for each backing device if the whole cache set is idle. A faster
writeback rate in idle time means new I/Os may have more available space
for dirty data, and people may observe a better write performance then.
Please note bcache may change its cache mode in run time, and this patch
still works if the cache mode is switched from writeback mode and there
is still dirty data on cache.
Fixes: Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle")
Cc: stable(a)vger.kernel.org #4.16+
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Michael Lyle <mlyle(a)lyle.org>
---
drivers/md/bcache/bcache.h | 9 +---
drivers/md/bcache/request.c | 42 ++++++++++++++-
drivers/md/bcache/sysfs.c | 14 +++--
drivers/md/bcache/util.c | 2 +-
drivers/md/bcache/util.h | 2 +-
drivers/md/bcache/writeback.c | 98 +++++++++++++++++++++++++----------
6 files changed, 126 insertions(+), 41 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index d6bf294f3907..f7451e8be03c 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -328,13 +328,6 @@ struct cached_dev {
*/
atomic_t has_dirty;
- /*
- * Set to zero by things that touch the backing volume-- except
- * writeback. Incremented by writeback. Used to determine when to
- * accelerate idle writeback.
- */
- atomic_t backing_idle;
-
struct bch_ratelimit writeback_rate;
struct delayed_work writeback_rate_update;
@@ -514,6 +507,8 @@ struct cache_set {
struct cache_accounting accounting;
unsigned long flags;
+ atomic_t idle_counter;
+ atomic_t at_max_writeback_rate;
struct cache_sb sb;
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index ae67f5fa8047..fe45f561a054 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -1104,6 +1104,34 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio)
/* Cached devices - read & write stuff */
+static void quit_max_writeback_rate(struct cache_set *c)
+{
+ int i;
+ struct bcache_device *d;
+ struct cached_dev *dc;
+
+ mutex_lock(&bch_register_lock);
+
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ /*
+ * set writeback rate to default minimum value,
+ * then let update_writeback_rate() to decide the
+ * upcoming rate.
+ */
+ atomic64_set(&dc->writeback_rate.rate, 1);
+ }
+
+ mutex_unlock(&bch_register_lock);
+}
+
static blk_qc_t cached_dev_make_request(struct request_queue *q,
struct bio *bio)
{
@@ -1119,7 +1147,19 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
return BLK_QC_T_NONE;
}
- atomic_set(&dc->backing_idle, 0);
+ if (d->c) {
+ atomic_set(&d->c->idle_counter, 0);
+ /*
+ * If at_max_writeback_rate of cache set is true and new I/O
+ * comes, quit max writeback rate of all cached devices
+ * attached to this cache set, and set at_max_writeback_rate
+ * to false.
+ */
+ if (unlikely(atomic_read(&d->c->at_max_writeback_rate) == 1)) {
+ atomic_set(&d->c->at_max_writeback_rate, 0);
+ quit_max_writeback_rate(d->c);
+ }
+ }
generic_start_io_acct(q, rw, bio_sectors(bio), &d->disk->part0);
bio_set_dev(bio, dc->bdev);
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 225b15aa0340..d719021bff81 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -170,7 +170,8 @@ SHOW(__bch_cached_dev)
var_printf(writeback_running, "%i");
var_print(writeback_delay);
var_print(writeback_percent);
- sysfs_hprint(writeback_rate, dc->writeback_rate.rate << 9);
+ sysfs_hprint(writeback_rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
sysfs_hprint(io_errors, atomic_read(&dc->io_errors));
sysfs_printf(io_error_limit, "%i", dc->error_limit);
sysfs_printf(io_disable, "%i", dc->io_disable);
@@ -188,7 +189,8 @@ SHOW(__bch_cached_dev)
char change[20];
s64 next_io;
- bch_hprint(rate, dc->writeback_rate.rate << 9);
+ bch_hprint(rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
bch_hprint(dirty, bcache_dev_sectors_dirty(&dc->disk) << 9);
bch_hprint(target, dc->writeback_rate_target << 9);
bch_hprint(proportional,dc->writeback_rate_proportional << 9);
@@ -255,8 +257,12 @@ STORE(__cached_dev)
sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent, 0, 40);
- sysfs_strtoul_clamp(writeback_rate,
- dc->writeback_rate.rate, 1, INT_MAX);
+ if (attr == &sysfs_writeback_rate) {
+ int v;
+
+ sysfs_strtoul_clamp(writeback_rate, v, 1, INT_MAX);
+ atomic64_set(&dc->writeback_rate.rate, v);
+ }
sysfs_strtoul_clamp(writeback_rate_update_seconds,
dc->writeback_rate_update_seconds,
diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index fc479b026d6d..84f90c3d996d 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -200,7 +200,7 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
{
uint64_t now = local_clock();
- d->next += div_u64(done * NSEC_PER_SEC, d->rate);
+ d->next += div_u64(done * NSEC_PER_SEC, atomic64_read(&d->rate));
/* Bound the time. Don't let us fall further than 2 seconds behind
* (this prevents unnecessary backlog that would make it impossible
diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h
index cced87f8eb27..7e17f32ab563 100644
--- a/drivers/md/bcache/util.h
+++ b/drivers/md/bcache/util.h
@@ -442,7 +442,7 @@ struct bch_ratelimit {
* Rate at which we want to do work, in units per second
* The units here correspond to the units passed to bch_next_delay()
*/
- uint32_t rate;
+ atomic64_t rate;
};
static inline void bch_ratelimit_reset(struct bch_ratelimit *d)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index ad45ebe1a74b..72059f910230 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -49,6 +49,63 @@ static uint64_t __calc_target_rate(struct cached_dev *dc)
return (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;
}
+static bool set_at_max_writeback_rate(struct cache_set *c,
+ struct cached_dev *dc)
+{
+ int i, dirty_dc_nr = 0;
+ struct bcache_device *d;
+
+ mutex_lock(&bch_register_lock);
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ if (atomic_read(&dc->has_dirty))
+ dirty_dc_nr++;
+ }
+ mutex_unlock(&bch_register_lock);
+
+ /*
+ * Idle_counter is increased everytime when update_writeback_rate()
+ * is rescheduled in. If all backing devices attached to the same
+ * cache set has same dc->writeback_rate_update_seconds value, it
+ * is about 10 rounds of update_writeback_rate() is called on each
+ * backing device, then the code will fall through at set 1 to
+ * c->at_max_writeback_rate, and a max wrteback rate to each
+ * dc->writeback_rate.rate. This is not very accurate but works well
+ * to make sure the whole cache set has no new I/O coming before
+ * writeback rate is set to a max number.
+ */
+ if (atomic_inc_return(&c->idle_counter) < dirty_dc_nr * 10)
+ return false;
+
+ if (atomic_read(&c->at_max_writeback_rate) != 1)
+ atomic_set(&c->at_max_writeback_rate, 1);
+
+
+ atomic64_set(&dc->writeback_rate.rate, INT_MAX);
+
+ /* keep writeback_rate_target as existing value */
+ dc->writeback_rate_proportional = 0;
+ dc->writeback_rate_integral_scaled = 0;
+ dc->writeback_rate_change = 0;
+
+ /*
+ * Check c->idle_counter and c->at_max_writeback_rate agagain in case
+ * new I/O arrives during before set_at_max_writeback_rate() returns.
+ * Then the writeback rate is set to 1, and its new value should be
+ * decided via __update_writeback_rate().
+ */
+ if (atomic_read(&c->idle_counter) < dirty_dc_nr * 10 ||
+ !atomic_read(&c->at_max_writeback_rate))
+ return false;
+
+ return true;
+}
+
static void __update_writeback_rate(struct cached_dev *dc)
{
/*
@@ -104,8 +161,9 @@ static void __update_writeback_rate(struct cached_dev *dc)
dc->writeback_rate_proportional = proportional_scaled;
dc->writeback_rate_integral_scaled = integral_scaled;
- dc->writeback_rate_change = new_rate - dc->writeback_rate.rate;
- dc->writeback_rate.rate = new_rate;
+ dc->writeback_rate_change = new_rate -
+ atomic64_read(&dc->writeback_rate.rate);
+ atomic64_set(&dc->writeback_rate.rate, new_rate);
dc->writeback_rate_target = target;
}
@@ -138,9 +196,16 @@ static void update_writeback_rate(struct work_struct *work)
down_read(&dc->writeback_lock);
- if (atomic_read(&dc->has_dirty) &&
- dc->writeback_percent)
- __update_writeback_rate(dc);
+ if (atomic_read(&dc->has_dirty) && dc->writeback_percent) {
+ /*
+ * If the whole cache set is idle, set_at_max_writeback_rate()
+ * will set writeback rate to a max number. Then it is
+ * unncessary to update writeback rate for an idle cache set
+ * in maximum writeback rate number(s).
+ */
+ if (!set_at_max_writeback_rate(c, dc))
+ __update_writeback_rate(dc);
+ }
up_read(&dc->writeback_lock);
@@ -422,27 +487,6 @@ static void read_dirty(struct cached_dev *dc)
delay = writeback_delay(dc, size);
- /* If the control system would wait for at least half a
- * second, and there's been no reqs hitting the backing disk
- * for awhile: use an alternate mode where we have at most
- * one contiguous set of writebacks in flight at a time. If
- * someone wants to do IO it will be quick, as it will only
- * have to contend with one operation in flight, and we'll
- * be round-tripping data to the backing disk as quickly as
- * it can accept it.
- */
- if (delay >= HZ / 2) {
- /* 3 means at least 1.5 seconds, up to 7.5 if we
- * have slowed way down.
- */
- if (atomic_inc_return(&dc->backing_idle) >= 3) {
- /* Wait for current I/Os to finish */
- closure_sync(&cl);
- /* And immediately launch a new set. */
- delay = 0;
- }
- }
-
while (!kthread_should_stop() &&
!test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
delay) {
@@ -715,7 +759,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
dc->writeback_running = true;
dc->writeback_percent = 10;
dc->writeback_delay = 30;
- dc->writeback_rate.rate = 1024;
+ atomic64_set(&dc->writeback_rate.rate, 1024);
dc->writeback_rate_minimum = 8;
dc->writeback_rate_update_seconds = WRITEBACK_RATE_UPDATE_SECS_DEFAULT;
--
2.17.1
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
There are several issues with the suspend/resume handling code of the
driver:
- The device is attached and detached in the runtime_suspend() and
runtime_resume() callbacks if the interface is running. However,
during xcan_chip_start() the interface is considered running,
causing the resume handler to incorrectly call netif_start_queue()
at the beginning of xcan_chip_start(), and on xcan_chip_start() error
return the suspend handler detaches the device leaving the user
unable to bring-up the device anymore.
- The device is not brought properly up on system resume. A reset is
done and the code tries to determine the bus state after that.
However, after reset the device is always in Configuration mode
(down), so the state checking code does not make sense and
communication will also not work.
- The suspend callback tries to set the device to sleep mode (low-power
mode which monitors the bus and brings the device back to normal mode
on activity), but then immediately disables the clocks (possibly
before the device reaches the sleep mode), which does not make sense
to me. If a clean shutdown is wanted before disabling clocks, we can
just bring it down completely instead of only sleep mode.
Reorganize the PM code so that only the clock logic remains in the
runtime PM callbacks and the system PM callbacks contain the device
bring-up/down logic. This makes calling the runtime PM callbacks during
e.g. xcan_chip_start() safe.
The system PM callbacks now simply call common code to start/stop the
HW if the interface was running, replacing the broken code from before.
xcan_chip_stop() is updated to use the common reset code so that it will
wait for the reset to complete. Reset also disables all interrupts so do
not do that separately.
Also, the device_may_wakeup() checks are removed as the driver does not
have wakeup support.
Tested on Zynq-7000 integrated CAN.
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: Michal Simek <michal.simek(a)xilinx.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 69 +++++++++++++++---------------------
1 file changed, 28 insertions(+), 41 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index cb80a9aa7281..5a24039733ef 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -984,13 +984,9 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
static void xcan_chip_stop(struct net_device *ndev)
{
struct xcan_priv *priv = netdev_priv(ndev);
- u32 ier;
/* Disable interrupts and leave the can in configuration mode */
- ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier &= ~XCAN_INTR_ALL;
- priv->write_reg(priv, XCAN_IER_OFFSET, ier);
- priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_RESET_MASK);
+ set_reset_mode(ndev);
priv->can.state = CAN_STATE_STOPPED;
}
@@ -1123,10 +1119,15 @@ static const struct net_device_ops xcan_netdev_ops = {
*/
static int __maybe_unused xcan_suspend(struct device *dev)
{
- if (!device_may_wakeup(dev))
- return pm_runtime_force_suspend(dev);
+ struct net_device *ndev = dev_get_drvdata(dev);
- return 0;
+ if (netif_running(ndev)) {
+ netif_stop_queue(ndev);
+ netif_device_detach(ndev);
+ xcan_chip_stop(ndev);
+ }
+
+ return pm_runtime_force_suspend(dev);
}
/**
@@ -1138,11 +1139,27 @@ static int __maybe_unused xcan_suspend(struct device *dev)
*/
static int __maybe_unused xcan_resume(struct device *dev)
{
- if (!device_may_wakeup(dev))
- return pm_runtime_force_resume(dev);
+ struct net_device *ndev = dev_get_drvdata(dev);
+ int ret;
- return 0;
+ ret = pm_runtime_force_resume(dev);
+ if (ret) {
+ dev_err(dev, "pm_runtime_force_resume failed on resume\n");
+ return ret;
+ }
+
+ if (netif_running(ndev)) {
+ ret = xcan_chip_start(ndev);
+ if (ret) {
+ dev_err(dev, "xcan_chip_start failed on resume\n");
+ return ret;
+ }
+ netif_device_attach(ndev);
+ netif_start_queue(ndev);
+ }
+
+ return 0;
}
/**
@@ -1157,14 +1174,6 @@ static int __maybe_unused xcan_runtime_suspend(struct device *dev)
struct net_device *ndev = dev_get_drvdata(dev);
struct xcan_priv *priv = netdev_priv(ndev);
- if (netif_running(ndev)) {
- netif_stop_queue(ndev);
- netif_device_detach(ndev);
- }
-
- priv->write_reg(priv, XCAN_MSR_OFFSET, XCAN_MSR_SLEEP_MASK);
- priv->can.state = CAN_STATE_SLEEPING;
-
clk_disable_unprepare(priv->bus_clk);
clk_disable_unprepare(priv->can_clk);
@@ -1183,7 +1192,6 @@ static int __maybe_unused xcan_runtime_resume(struct device *dev)
struct net_device *ndev = dev_get_drvdata(dev);
struct xcan_priv *priv = netdev_priv(ndev);
int ret;
- u32 isr, status;
ret = clk_prepare_enable(priv->bus_clk);
if (ret) {
@@ -1197,27 +1205,6 @@ static int __maybe_unused xcan_runtime_resume(struct device *dev)
return ret;
}
- priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_RESET_MASK);
- isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
- status = priv->read_reg(priv, XCAN_SR_OFFSET);
-
- if (netif_running(ndev)) {
- if (isr & XCAN_IXR_BSOFF_MASK) {
- priv->can.state = CAN_STATE_BUS_OFF;
- priv->write_reg(priv, XCAN_SRR_OFFSET,
- XCAN_SRR_RESET_MASK);
- } else if ((status & XCAN_SR_ESTAT_MASK) ==
- XCAN_SR_ESTAT_MASK) {
- priv->can.state = CAN_STATE_ERROR_PASSIVE;
- } else if (status & XCAN_SR_ERRWRN_MASK) {
- priv->can.state = CAN_STATE_ERROR_WARNING;
- } else {
- priv->can.state = CAN_STATE_ERROR_ACTIVE;
- }
- netif_device_attach(ndev);
- netif_start_queue(ndev);
- }
-
return 0;
}
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
xcan_interrupt() clears ERROR|RXOFLV|BSOFF|ARBLST interrupts if any of
them is asserted. This does not take into account that some of them
could have been asserted between interrupt status read and interrupt
clear, therefore clearing them without handling them.
Fix the code to only clear those interrupts that it knows are asserted
and therefore going to be processed in xcan_err_interrupt().
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: Michal Simek <michal.simek(a)xilinx.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index ea9f9d1a5ba7..cb80a9aa7281 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -938,6 +938,7 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
struct net_device *ndev = (struct net_device *)dev_id;
struct xcan_priv *priv = netdev_priv(ndev);
u32 isr, ier;
+ u32 isr_errors;
/* Get the interrupt status from Xilinx CAN */
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
@@ -956,11 +957,10 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
xcan_tx_interrupt(ndev, isr);
/* Check for the type of error interrupt and Processing it */
- if (isr & (XCAN_IXR_ERROR_MASK | XCAN_IXR_RXOFLW_MASK |
- XCAN_IXR_BSOFF_MASK | XCAN_IXR_ARBLST_MASK)) {
- priv->write_reg(priv, XCAN_ICR_OFFSET, (XCAN_IXR_ERROR_MASK |
- XCAN_IXR_RXOFLW_MASK | XCAN_IXR_BSOFF_MASK |
- XCAN_IXR_ARBLST_MASK));
+ isr_errors = isr & (XCAN_IXR_ERROR_MASK | XCAN_IXR_RXOFLW_MASK |
+ XCAN_IXR_BSOFF_MASK | XCAN_IXR_ARBLST_MASK);
+ if (isr_errors) {
+ priv->write_reg(priv, XCAN_ICR_OFFSET, isr_errors);
xcan_err_interrupt(ndev, isr);
}
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
The xilinx_can driver assumes that the TXOK interrupt only clears after
it has been acknowledged as many times as there have been successfully
sent frames.
However, the documentation does not mention such behavior, instead
saying just that the interrupt is cleared when the clear bit is set.
Similarly, testing seems to also suggest that it is immediately cleared
regardless of the amount of frames having been sent. Performing some
heavy TX load and then going back to idle has the tx_head drifting
further away from tx_tail over time, steadily reducing the amount of
frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
interrupt always frees up space for 1 frame from the driver's
perspective, so frames continue to be sent) and delaying the local echo
frames.
The TX FIFO tracking is also otherwise buggy as it does not account for
TX FIFO being cleared after software resets, causing
BUG!, TX FIFO full when queue awake!
messages to be output.
There does not seem to be any way to accurately track the state of the
TX FIFO for local echo support while using the full TX FIFO.
The Zynq version of the HW (but not the soft-AXI version) has watermark
programming support and with it an additional TX-FIFO-empty interrupt
bit.
Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI
and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used
to detect whether 1 or 2 frames have been sent at interrupt processing
time.
Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode
was also tested.
An alternative way to solve this would be to drop local echo support but
keep using the full TX FIFO.
v2: Add FIFO space check before TX queue wake with locking to
synchronize with queue stop. This avoids waking the queue when xmit()
had just filled it.
v3: Keep local echo support and reduce the amount of frames in FIFO
instead as suggested by Marc Kleine-Budde.
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 139 +++++++++++++++++++++++++++++++----
1 file changed, 123 insertions(+), 16 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 763408a3eafb..dcbdc3cd651c 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -26,8 +26,10 @@
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/of.h>
+#include <linux/of_device.h>
#include <linux/platform_device.h>
#include <linux/skbuff.h>
+#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/types.h>
#include <linux/can/dev.h>
@@ -119,6 +121,7 @@ enum xcan_reg {
/**
* struct xcan_priv - This definition define CAN driver instance
* @can: CAN private data structure.
+ * @tx_lock: Lock for synchronizing TX interrupt handling
* @tx_head: Tx CAN packets ready to send on the queue
* @tx_tail: Tx CAN packets successfully sended on the queue
* @tx_max: Maximum number packets the driver can send
@@ -133,6 +136,7 @@ enum xcan_reg {
*/
struct xcan_priv {
struct can_priv can;
+ spinlock_t tx_lock;
unsigned int tx_head;
unsigned int tx_tail;
unsigned int tx_max;
@@ -160,6 +164,11 @@ static const struct can_bittiming_const xcan_bittiming_const = {
.brp_inc = 1,
};
+#define XCAN_CAP_WATERMARK 0x0001
+struct xcan_devtype_data {
+ unsigned int caps;
+};
+
/**
* xcan_write_reg_le - Write a value to the device register little endian
* @priv: Driver private data structure
@@ -239,6 +248,10 @@ static int set_reset_mode(struct net_device *ndev)
usleep_range(500, 10000);
}
+ /* reset clears FIFOs */
+ priv->tx_head = 0;
+ priv->tx_tail = 0;
+
return 0;
}
@@ -393,6 +406,7 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
struct net_device_stats *stats = &ndev->stats;
struct can_frame *cf = (struct can_frame *)skb->data;
u32 id, dlc, data[2] = {0, 0};
+ unsigned long flags;
if (can_dropped_invalid_skb(ndev, skb))
return NETDEV_TX_OK;
@@ -440,6 +454,9 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
data[1] = be32_to_cpup((__be32 *)(cf->data + 4));
can_put_echo_skb(skb, ndev, priv->tx_head % priv->tx_max);
+
+ spin_lock_irqsave(&priv->tx_lock, flags);
+
priv->tx_head++;
/* Write the Frame to Xilinx CAN TX FIFO */
@@ -455,10 +472,16 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
stats->tx_bytes += cf->can_dlc;
}
+ /* Clear TX-FIFO-empty interrupt for xcan_tx_interrupt() */
+ if (priv->tx_max > 1)
+ priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXFEMP_MASK);
+
/* Check if the TX buffer is full */
if ((priv->tx_head - priv->tx_tail) == priv->tx_max)
netif_stop_queue(ndev);
+ spin_unlock_irqrestore(&priv->tx_lock, flags);
+
return NETDEV_TX_OK;
}
@@ -832,19 +855,71 @@ static void xcan_tx_interrupt(struct net_device *ndev, u32 isr)
{
struct xcan_priv *priv = netdev_priv(ndev);
struct net_device_stats *stats = &ndev->stats;
+ unsigned int frames_in_fifo;
+ int frames_sent = 1; /* TXOK => at least 1 frame was sent */
+ unsigned long flags;
+ int retries = 0;
+
+ /* Synchronize with xmit as we need to know the exact number
+ * of frames in the FIFO to stay in sync due to the TXFEMP
+ * handling.
+ * This also prevents a race between netif_wake_queue() and
+ * netif_stop_queue().
+ */
+ spin_lock_irqsave(&priv->tx_lock, flags);
+
+ frames_in_fifo = priv->tx_head - priv->tx_tail;
- while ((priv->tx_head - priv->tx_tail > 0) &&
- (isr & XCAN_IXR_TXOK_MASK)) {
+ if (WARN_ON_ONCE(frames_in_fifo == 0)) {
+ /* clear TXOK anyway to avoid getting back here */
priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+ spin_unlock_irqrestore(&priv->tx_lock, flags);
+ return;
+ }
+
+ /* Check if 2 frames were sent (TXOK only means that at least 1
+ * frame was sent).
+ */
+ if (frames_in_fifo > 1) {
+ WARN_ON(frames_in_fifo > priv->tx_max);
+
+ /* Synchronize TXOK and isr so that after the loop:
+ * (1) isr variable is up-to-date at least up to TXOK clear
+ * time. This avoids us clearing a TXOK of a second frame
+ * but not noticing that the FIFO is now empty and thus
+ * marking only a single frame as sent.
+ * (2) No TXOK is left. Having one could mean leaving a
+ * stray TXOK as we might process the associated frame
+ * via TXFEMP handling as we read TXFEMP *after* TXOK
+ * clear to satisfy (1).
+ */
+ while ((isr & XCAN_IXR_TXOK_MASK) && !WARN_ON(++retries == 100)) {
+ priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+ isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
+ }
+
+ if (isr & XCAN_IXR_TXFEMP_MASK) {
+ /* nothing in FIFO anymore */
+ frames_sent = frames_in_fifo;
+ }
+ } else {
+ /* single frame in fifo, just clear TXOK */
+ priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+ }
+
+ while (frames_sent--) {
can_get_echo_skb(ndev, priv->tx_tail %
priv->tx_max);
priv->tx_tail++;
stats->tx_packets++;
- isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
}
+
+ netif_wake_queue(ndev);
+
+ spin_unlock_irqrestore(&priv->tx_lock, flags);
+
can_led_event(ndev, CAN_LED_EVENT_TX);
xcan_update_error_state_after_rxtx(ndev);
- netif_wake_queue(ndev);
}
/**
@@ -1151,6 +1226,18 @@ static const struct dev_pm_ops xcan_dev_pm_ops = {
SET_RUNTIME_PM_OPS(xcan_runtime_suspend, xcan_runtime_resume, NULL)
};
+static const struct xcan_devtype_data xcan_zynq_data = {
+ .caps = XCAN_CAP_WATERMARK,
+};
+
+/* Match table for OF platform binding */
+static const struct of_device_id xcan_of_match[] = {
+ { .compatible = "xlnx,zynq-can-1.0", .data = &xcan_zynq_data },
+ { .compatible = "xlnx,axi-can-1.00.a", },
+ { /* end of list */ },
+};
+MODULE_DEVICE_TABLE(of, xcan_of_match);
+
/**
* xcan_probe - Platform registration call
* @pdev: Handle to the platform device structure
@@ -1165,8 +1252,10 @@ static int xcan_probe(struct platform_device *pdev)
struct resource *res; /* IO mem resources */
struct net_device *ndev;
struct xcan_priv *priv;
+ const struct of_device_id *of_id;
+ int caps = 0;
void __iomem *addr;
- int ret, rx_max, tx_max;
+ int ret, rx_max, tx_max, tx_fifo_depth;
/* Get the virtual base address for the device */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -1176,7 +1265,8 @@ static int xcan_probe(struct platform_device *pdev)
goto err;
}
- ret = of_property_read_u32(pdev->dev.of_node, "tx-fifo-depth", &tx_max);
+ ret = of_property_read_u32(pdev->dev.of_node, "tx-fifo-depth",
+ &tx_fifo_depth);
if (ret < 0)
goto err;
@@ -1184,6 +1274,30 @@ static int xcan_probe(struct platform_device *pdev)
if (ret < 0)
goto err;
+ of_id = of_match_device(xcan_of_match, &pdev->dev);
+ if (of_id) {
+ const struct xcan_devtype_data *devtype_data = of_id->data;
+
+ if (devtype_data)
+ caps = devtype_data->caps;
+ }
+
+ /* There is no way to directly figure out how many frames have been
+ * sent when the TXOK interrupt is processed. If watermark programming
+ * is supported, we can have 2 frames in the FIFO and use TXFEMP
+ * to determine if 1 or 2 frames have been sent.
+ * Theoretically we should be able to use TXFWMEMP to determine up
+ * to 3 frames, but it seems that after putting a second frame in the
+ * FIFO, with watermark at 2 frames, it can happen that TXFWMEMP (less
+ * than 2 frames in FIFO) is set anyway with no TXOK (a frame was
+ * sent), which is not a sensible state - possibly TXFWMEMP is not
+ * completely synchronized with the rest of the bits?
+ */
+ if (caps & XCAN_CAP_WATERMARK)
+ tx_max = min(tx_fifo_depth, 2);
+ else
+ tx_max = 1;
+
/* Create a CAN device instance */
ndev = alloc_candev(sizeof(struct xcan_priv), tx_max);
if (!ndev)
@@ -1198,6 +1312,7 @@ static int xcan_probe(struct platform_device *pdev)
CAN_CTRLMODE_BERR_REPORTING;
priv->reg_base = addr;
priv->tx_max = tx_max;
+ spin_lock_init(&priv->tx_lock);
/* Get IRQ for the device */
ndev->irq = platform_get_irq(pdev, 0);
@@ -1262,9 +1377,9 @@ static int xcan_probe(struct platform_device *pdev)
pm_runtime_put(&pdev->dev);
- netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo depth:%d\n",
+ netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo depth: actual %d, using %d\n",
priv->reg_base, ndev->irq, priv->can.clock.freq,
- priv->tx_max);
+ tx_fifo_depth, priv->tx_max);
return 0;
@@ -1298,14 +1413,6 @@ static int xcan_remove(struct platform_device *pdev)
return 0;
}
-/* Match table for OF platform binding */
-static const struct of_device_id xcan_of_match[] = {
- { .compatible = "xlnx,zynq-can-1.0", },
- { .compatible = "xlnx,axi-can-1.00.a", },
- { /* end of list */ },
-};
-MODULE_DEVICE_TABLE(of, xcan_of_match);
-
static struct platform_driver xcan_driver = {
.probe = xcan_probe,
.remove = xcan_remove,
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
If the device gets into a state where RXNEMP (RX FIFO not empty)
interrupt is asserted without RXOK (new frame received successfully)
interrupt being asserted, xcan_rx_poll() will continue to try to clear
RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is
not empty, the interrupt will not be cleared and napi_schedule() will
just be called again.
This situation can occur when:
(a) xcan_rx() returns without reading RX FIFO due to an error condition.
The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear
due to a frame still being in the FIFO. The frame will never be read
from the FIFO as RXOK is no longer set.
(b) A frame is received between xcan_rx_poll() reading interrupt status
and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain
set as the new message is still in the FIFO.
I'm able to trigger case (b) by flooding the bus with frames under load.
There does not seem to be any benefit in using both RXNEMP and RXOK in
the way the driver does, and the polling example in the reference manual
(UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either
RXOK or RXNEMP can be used for detecting incoming messages.
Fix the issue and simplify the RX processing by only using RXNEMP
without RXOK.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 389a9603db8c..1bda47aa62f5 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -101,7 +101,7 @@ enum xcan_reg {
#define XCAN_INTR_ALL (XCAN_IXR_TXOK_MASK | XCAN_IXR_BSOFF_MASK |\
XCAN_IXR_WKUP_MASK | XCAN_IXR_SLP_MASK | \
XCAN_IXR_RXNEMP_MASK | XCAN_IXR_ERROR_MASK | \
- XCAN_IXR_ARBLST_MASK | XCAN_IXR_RXOK_MASK)
+ XCAN_IXR_ARBLST_MASK)
/* CAN register bit shift - XCAN_<REG>_<BIT>_SHIFT */
#define XCAN_BTR_SJW_SHIFT 7 /* Synchronous jump width */
@@ -708,15 +708,7 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
while ((isr & XCAN_IXR_RXNEMP_MASK) && (work_done < quota)) {
- if (isr & XCAN_IXR_RXOK_MASK) {
- priv->write_reg(priv, XCAN_ICR_OFFSET,
- XCAN_IXR_RXOK_MASK);
- work_done += xcan_rx(ndev);
- } else {
- priv->write_reg(priv, XCAN_ICR_OFFSET,
- XCAN_IXR_RXNEMP_MASK);
- break;
- }
+ work_done += xcan_rx(ndev);
priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_RXNEMP_MASK);
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
}
@@ -727,7 +719,7 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
if (work_done < quota) {
napi_complete_done(napi, work_done);
ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier |= (XCAN_IXR_RXOK_MASK | XCAN_IXR_RXNEMP_MASK);
+ ier |= XCAN_IXR_RXNEMP_MASK;
priv->write_reg(priv, XCAN_IER_OFFSET, ier);
}
return work_done;
@@ -799,9 +791,9 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
}
/* Check for the type of receive interrupt and Processing it */
- if (isr & (XCAN_IXR_RXNEMP_MASK | XCAN_IXR_RXOK_MASK)) {
+ if (isr & XCAN_IXR_RXNEMP_MASK) {
ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier &= ~(XCAN_IXR_RXNEMP_MASK | XCAN_IXR_RXOK_MASK);
+ ier &= ~XCAN_IXR_RXNEMP_MASK;
priv->write_reg(priv, XCAN_IER_OFFSET, ier);
napi_schedule(&priv->napi);
}
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
The xilinx_can driver performs a software reset when an RX overrun is
detected. This causes the device to enter Configuration mode where no
messages are received or transmitted.
The documentation does not mention any need to perform a reset on an RX
overrun, and testing by inducing an RX overflow also indicated that the
device continues to work just fine without a reset.
Remove the software reset.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 89aec07c225f..389a9603db8c 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -600,7 +600,6 @@ static void xcan_err_interrupt(struct net_device *ndev, u32 isr)
if (isr & XCAN_IXR_RXOFLW_MASK) {
stats->rx_over_errors++;
stats->rx_errors++;
- priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_RESET_MASK);
if (skb) {
cf->can_id |= CAN_ERR_CRTL;
cf->data[1] |= CAN_ERR_CRTL_RX_OVERFLOW;
--
2.18.0
From: Faiz Abbas <faiz_abbas(a)ti.com>
pm_runtime_get_sync() returns a 1 if the state of the device is already
'active'. This is not a failure case and should return a success.
Therefore fix error handling for pm_runtime_get_sync() call such that
it returns success when the value is 1.
Also cleanup the TODO for using runtime PM for sleep mode as that is
implemented.
Signed-off-by: Faiz Abbas <faiz_abbas(a)ti.com>
Cc: <stable(a)vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/m_can/m_can.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 8e2b7f873c4d..e2f965c2e3aa 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -634,10 +634,12 @@ static int m_can_clk_start(struct m_can_priv *priv)
int err;
err = pm_runtime_get_sync(priv->device);
- if (err)
+ if (err < 0) {
pm_runtime_put_noidle(priv->device);
+ return err;
+ }
- return err;
+ return 0;
}
static void m_can_clk_stop(struct m_can_priv *priv)
@@ -1688,8 +1690,6 @@ static int m_can_plat_probe(struct platform_device *pdev)
return ret;
}
-/* TODO: runtime PM with power down or sleep mode */
-
static __maybe_unused int m_can_suspend(struct device *dev)
{
struct net_device *ndev = dev_get_drvdata(dev);
--
2.18.0
From: Roman Fietze <roman.fietze(a)telemotive.de>
Inside m_can_chip_config(), when setting up the new value of the CCCR,
the CCCR_NISO bit is not cleared like the others, CCCR_TEST, CCCR_MON,
CCCR_BRSE and CCCR_FDOE, before checking the can.ctrlmode bits for
CAN_CTRLMODE_FD_NON_ISO.
This way once the controller was configured for CAN_CTRLMODE_FD_NON_ISO,
this mode could never be cleared again.
This fix is only relevant for controllers with version 3.1.x or 3.2.x.
Older versions do not support NISO.
Signed-off-by: Roman Fietze <roman.fietze(a)telemotive.de>
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/m_can/m_can.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index b397a33f3d32..8e2b7f873c4d 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1109,7 +1109,8 @@ static void m_can_chip_config(struct net_device *dev)
} else {
/* Version 3.1.x or 3.2.x */
- cccr &= ~(CCCR_TEST | CCCR_MON | CCCR_BRSE | CCCR_FDOE);
+ cccr &= ~(CCCR_TEST | CCCR_MON | CCCR_BRSE | CCCR_FDOE |
+ CCCR_NISO);
/* Only 3.2.x has NISO Bit implemented */
if (priv->can.ctrlmode & CAN_CTRLMODE_FD_NON_ISO)
--
2.18.0
From: Stephane Grosjean <s.grosjean(a)peak-system.com>
The DMA logic in firmwares < v3.3.0 embedded in the PCAN-PCIe FD cards
family is not capable of handling a mix of 32-bit and 64-bit logical
addresses. If the board is equipped with 2 or 4 CAN ports, then such a
situation might lead to a PCIe Bus Error "Malformed TLP" packet
as well as "irq xx: nobody cared" issue.
This patch adds a workaround that requests only 32-bit DMA addresses
when these might be allocated outside of the 4 GB area.
This issue has been fixed in firmware v3.3.0 and next.
Signed-off-by: Stephane Grosjean <s.grosjean(a)peak-system.com>
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/peak_canfd/peak_pciefd_main.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/net/can/peak_canfd/peak_pciefd_main.c b/drivers/net/can/peak_canfd/peak_pciefd_main.c
index b9e28578bc7b..455a3797a200 100644
--- a/drivers/net/can/peak_canfd/peak_pciefd_main.c
+++ b/drivers/net/can/peak_canfd/peak_pciefd_main.c
@@ -58,6 +58,10 @@ MODULE_LICENSE("GPL v2");
#define PCIEFD_REG_SYS_VER1 0x0040 /* version reg #1 */
#define PCIEFD_REG_SYS_VER2 0x0044 /* version reg #2 */
+#define PCIEFD_FW_VERSION(x, y, z) (((u32)(x) << 24) | \
+ ((u32)(y) << 16) | \
+ ((u32)(z) << 8))
+
/* System Control Registers Bits */
#define PCIEFD_SYS_CTL_TS_RST 0x00000001 /* timestamp clock */
#define PCIEFD_SYS_CTL_CLK_EN 0x00000002 /* system clock */
@@ -782,6 +786,21 @@ static int peak_pciefd_probe(struct pci_dev *pdev,
"%ux CAN-FD PCAN-PCIe FPGA v%u.%u.%u:\n", can_count,
hw_ver_major, hw_ver_minor, hw_ver_sub);
+#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+ /* FW < v3.3.0 DMA logic doesn't handle correctly the mix of 32-bit and
+ * 64-bit logical addresses: this workaround forces usage of 32-bit
+ * DMA addresses only when such a fw is detected.
+ */
+ if (PCIEFD_FW_VERSION(hw_ver_major, hw_ver_minor, hw_ver_sub) <
+ PCIEFD_FW_VERSION(3, 3, 0)) {
+ err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+ if (err)
+ dev_warn(&pdev->dev,
+ "warning: can't set DMA mask %llxh (err %d)\n",
+ DMA_BIT_MASK(32), err);
+ }
+#endif
+
/* stop system clock */
pciefd_sys_writereg(pciefd, PCIEFD_SYS_CTL_CLK_EN,
PCIEFD_REG_SYS_CTL_CLR);
--
2.18.0
Greg,
this series contains backports of the following upstream commits:
243a4f8126fc ubi: Introduce vol_ignored()
fdf10ed710c0 ubi: Rework Fastmap attach base code
74f2c6e9a47c ubi: Be more paranoid while seaching for the most recent Fastmap
2e8f08deabbc ubi: Fix races around ubi_refill_pools()
f7d11b33d4e8 ubi: Fix Fastmap's update_vol()
5793f39de7f6 ubi: fastmap: Erase outdated anchor PEBs during attach
The first two patches are not directly stable patches but the other patches
depend on them.
Richard Weinberger (5):
ubi: Introduce vol_ignored()
ubi: Rework Fastmap attach base code
ubi: Be more paranoid while seaching for the most recent Fastmap
ubi: Fix races around ubi_refill_pools()
ubi: Fix Fastmap's update_vol()
Sascha Hauer (1):
ubi: fastmap: Erase outdated anchor PEBs during attach
drivers/mtd/ubi/attach.c | 139 ++++++++++++++++++++++++++---------
drivers/mtd/ubi/eba.c | 4 +-
drivers/mtd/ubi/fastmap-wl.c | 6 +-
drivers/mtd/ubi/fastmap.c | 51 +++++++++++--
drivers/mtd/ubi/ubi.h | 46 +++++++++++-
drivers/mtd/ubi/wl.c | 114 ++++++++++++++++++++++------
6 files changed, 292 insertions(+), 68 deletions(-)
--
2.18.0
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b03897cf318dfc47de33a7ecbc7655584266f034 Mon Sep 17 00:00:00 2001
From: "Gautham R. Shenoy" <ego(a)linux.vnet.ibm.com>
Date: Wed, 18 Jul 2018 14:03:16 +0530
Subject: [PATCH] powerpc/powernv: Fix save/restore of SPRG3 on entry/exit from
stop (idle)
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror
SPRN_USPRG3 are used as userspace VDSO write and read registers
respectively.
SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not
restored. As a result, any read from SPRN_USPRG3 returns zero on an
exit from stop4 (Power9 only) and above.
Thus in this situation, on POWER9, any call from sched_getcpu() always
returns zero, as on powerpc, we call __kernel_getcpu() which relies
upon SPRN_USPRG3 to report the CPU and NUMA node information.
Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state
with the sprg_vdso value that is cached in PACA.
Fixes: e1c1cfed5432 ("powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle")
Cc: stable(a)vger.kernel.org # v4.14+
Reported-by: Florian Weimer <fweimer(a)redhat.com>
Signed-off-by: Gautham R. Shenoy <ego(a)linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index e734f6e45abc..689306118b48 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -144,7 +144,9 @@ power9_restore_additional_sprs:
mtspr SPRN_MMCR1, r4
ld r3, STOP_MMCR2(r13)
+ ld r4, PACA_SPRG_VDSO(r13)
mtspr SPRN_MMCR2, r3
+ mtspr SPRN_SPRG3, r4
blr
/*
Correcting the stable ML address ...
On 23/07/18 10:59, Jon Hunter wrote:
> Please include the following fix for stable v4.4.y. If the PLL_U is not
> configured by the bootloader, then without this change it will not be
> configured by the kernel and this will cause USB host support to fail
> which uses the PLL_U for its clock.
>
> Please note that this patch did not apply cleanly to v4.4.y, so I have
> back-ported, but the resulting change is the same as the original.
>
>>From 797097301860c64b63346d068ba4fe4992bd5021 Mon Sep 17 00:00:00 2001
> From: Lucas Stach <dev(a)lynxeye.de>
> Date: Mon, 29 Feb 2016 21:46:07 +0100
> Subject: [PATCH] clk: tegra: Fix PLL_U post divider and initial rate on
> Tegra30
>
> commit 797097301860c64b63346d068ba4fe4992bd5021 upstream
>
> The post divider value in the frequency table is wrong as it would lead
> to the PLL producing an output rate of 960 MHz instead of the desired
> 480 MHz. This wasn't a problem as nothing used the table to actually
> initialize the PLL rate, but the bootloader configuration was used
> unaltered.
>
> If the bootloader does not set up the PLL it will fail to come when used
> under Linux. To fix this don't rely on the bootloader, but set the
> correct rate in the clock driver.
>
> Change-Id: I9375c24ef0d48b1b98be10378e8d593299b0453b
> Signed-off-by: Lucas Stach <dev(a)lynxeye.de>
> Signed-off-by: Thierry Reding <treding(a)nvidia.com>
> [jonathanh(a)nvidia.com: Back-ported to stable v4.4.y]
> Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
> ---
> drivers/clk/tegra/clk-tegra30.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/clk/tegra/clk-tegra30.c b/drivers/clk/tegra/clk-tegra30.c
> index 8c41c6fcb9ee..acf83569f86f 100644
> --- a/drivers/clk/tegra/clk-tegra30.c
> +++ b/drivers/clk/tegra/clk-tegra30.c
> @@ -333,11 +333,11 @@ static struct pdiv_map pllu_p[] = {
> };
>
> static struct tegra_clk_pll_freq_table pll_u_freq_table[] = {
> - { 12000000, 480000000, 960, 12, 0, 12},
> - { 13000000, 480000000, 960, 13, 0, 12},
> - { 16800000, 480000000, 400, 7, 0, 5},
> - { 19200000, 480000000, 200, 4, 0, 3},
> - { 26000000, 480000000, 960, 26, 0, 12},
> + { 12000000, 480000000, 960, 12, 2, 12 },
> + { 13000000, 480000000, 960, 13, 2, 12 },
> + { 16800000, 480000000, 400, 7, 2, 5 },
> + { 19200000, 480000000, 200, 4, 2, 3 },
> + { 26000000, 480000000, 960, 26, 2, 12 },
> { 0, 0, 0, 0, 0, 0 },
> };
>
> @@ -1372,6 +1372,7 @@ static struct tegra_clk_init_table init_table[] __initdata = {
> {TEGRA30_CLK_GR2D, TEGRA30_CLK_PLL_C, 300000000, 0},
> {TEGRA30_CLK_GR3D, TEGRA30_CLK_PLL_C, 300000000, 0},
> {TEGRA30_CLK_GR3D2, TEGRA30_CLK_PLL_C, 300000000, 0},
> + { TEGRA30_CLK_PLL_U, TEGRA30_CLK_CLK_MAX, 480000000, 0 },
> {TEGRA30_CLK_CLK_MAX, TEGRA30_CLK_CLK_MAX, 0, 0}, /* This MUST be the last entry. */
> };
>
>
--
nvpublic
Adding stable and lkml.
Sorry for spam others.
-Mukesh
On 7/23/2018 1:57 PM, Mukesh Ojha wrote:
>
> Hi All,
>
> I wanted to discuss about one of the corner case exists in 4.9 kernel
> (4.9.x) where
> If hotplug of one of the CPU fails due to failure in one of the callback,
> which is to be called after "notify:online"(as notify_online will
> create sysfs nodes
> for the hotplug cpu) .
>
> So, while cleaning up notify_dead() does not get called as step
> <https://elixir.bootlin.com/linux/v4.9/ident/step>->skip_onerr set to
> true for "notify:prepare"and due to that sysfs nodes of that cpu does
> not get
> cleaned up which can cause issue in next hotplug attempt of that cpu.
>
> Fails
> cpuhp_up_callbacks
> <https://elixir.bootlin.com/linux/v4.9/ident/cpuhp_up_callbacks> =>
> cpuhp_invoke_callback
> <https://elixir.bootlin.com/linux/v4.9/ident/cpuhp_invoke_callback> =>
> undo_cpu_up <https://elixir.bootlin.com/linux/v4.9/ident/undo_cpu_up>
>
> .name = "notify:prepare",
> .teardown.single = notify_dead
> <https://elixir.bootlin.com/linux/v4.9/ident/notify_dead>,
> .skip_onerr = true,
>
> I think the possible solution here could be to remove the
> - .skip_onerr = true,
>
> for "notify:prepare"so that CPU_DEAD notification get send.
>
> Please, feel free to suggest if it has any side-effect as i don't feel
> any.
>
> Ref:
>
> https://elixir.bootlin.com/linux/v4.9/source/kernel/cpu.c#L458
>
> Cheers,
> Mukesh
>
>
>
>
>
>
Please cherry-pick upstream commits:
d03db2b "compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline
declarations"
0e2e160 "x86/asm: Add _ASM_ARG* constants for argument registers to <asm/asm.h>"
d0a8d93 "x86/paravirt: Make native_save_fl() extern inline"
To stable branches 4.4+. They will allow 4.4+ x86 kernels compiled
with Clang and have the configs CONFIG_STACK_PROTECTOR_STRONG and
CONFIG_PARAVIRT to boot. They also allow gcc 5.1+ users to have
consistent `extern inline` semantics.
In response to:
https://android-review.googlesource.com/c/kernel/common/+/716477#message-29…
One of these days I'll remember to cc stable in the commit message
properly...sorry!
--
Thanks,
~Nick Desaulniers
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a8f688ec437dc2045cc8f0c89fe877d5803850da Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Fri, 4 May 2018 15:35:46 -0400
Subject: [PATCH] xprtrdma: Return -ENOBUFS when no pages are available
The use of -EAGAIN in rpcrdma_convert_iovs() is a latent bug: the
transport never calls xprt_write_space() when more pages become
available. -ENOBUFS will trigger the correct "delay briefly and call
again" logic.
Fixes: 7a89f9c626e3 ("xprtrdma: Honor ->send_request API contract")
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Cc: stable(a)vger.kernel.org # 4.8+
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index d676106295ff..1d7857919d3d 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -231,7 +231,7 @@ rpcrdma_convert_iovs(struct rpcrdma_xprt *r_xprt, struct xdr_buf *xdrbuf,
*/
*ppages = alloc_page(GFP_ATOMIC);
if (!*ppages)
- return -EAGAIN;
+ return -ENOBUFS;
}
seg->mr_page = *ppages;
seg->mr_offset = (char *)page_base;
Tree/Branch: v4.4.143
Git describe: v4.4.143
Commit: a8ea6276d0 Linux 4.4.143
Build Time: 63 min 40 sec
Passed: 10 / 10 (100.00 %)
Failed: 0 / 10 ( 0.00 %)
Errors: 0
Warnings: 31
Section Mismatches: 0
-------------------------------------------------------------------------------
defconfigs with issues (other than build errors):
19 warnings 0 mismatches : arm64-allmodconfig
17 warnings 0 mismatches : x86_64-allmodconfig
-------------------------------------------------------------------------------
Warnings Summary: 31
3 warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
2 ../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
2 ../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
2 ../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2752 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2720 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4759:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2096 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 6784 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5840 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2304 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2104 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2552 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3248 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 3008 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 2992 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5296 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv0367.c:3147:1: warning: the frame size of 4144 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv0367.c:2490:1: warning: the frame size of 3424 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2984 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4328 bytes is larger than 2048 bytes [-Wframe-larger-than=]
===============================================================================
Detailed per-defconfig build reports below:
-------------------------------------------------------------------------------
arm64-allmodconfig : PASS, 0 errors, 19 warnings, 0 section mismatches
Warnings:
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 2992 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3248 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5840 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 6784 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv0367.c:2490:1: warning: the frame size of 3424 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2720 bytes is larger than 2048 bytes [-Wframe-larger-than=]
-------------------------------------------------------------------------------
x86_64-allmodconfig : PASS, 0 errors, 17 warnings, 0 section mismatches
Warnings:
../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 3008 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2104 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2304 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2096 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5296 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4759:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2552 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv0367.c:3147:1: warning: the frame size of 4144 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2984 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4328 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2752 bytes is larger than 2048 bytes [-Wframe-larger-than=]
-------------------------------------------------------------------------------
Passed with no errors, warnings or mismatches:
arm64-allnoconfig
arm-multi_v5_defconfig
arm-multi_v7_defconfig
x86_64-defconfig
arm-allmodconfig
arm-allnoconfig
x86_64-allnoconfig
arm64-defconfig
While working on extended rand for last_error/first_error timestamps,
I noticed that the endianess is wrong, we access the little-endian
fields in struct ext4_super_block as native-endian when we print them.
This adds a special case in ext4_attr_show() and ext4_attr_store()
to byteswap the superblock fields if needed.
In older kernels, this code was part of super.c, it got moved to sysfs.c
in linux-4.4.
Cc: stable(a)vger.kernel.org
Fixes: 52c198c6820f ("ext4: add sysfs entry showing whether the fs contains errors")
Reviewed-by: Andreas Dilger <adilger(a)dilger.ca>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
fs/ext4/sysfs.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c
index f34da0bb8f17..b970a200f20c 100644
--- a/fs/ext4/sysfs.c
+++ b/fs/ext4/sysfs.c
@@ -274,8 +274,12 @@ static ssize_t ext4_attr_show(struct kobject *kobj,
case attr_pointer_ui:
if (!ptr)
return 0;
- return snprintf(buf, PAGE_SIZE, "%u\n",
- *((unsigned int *) ptr));
+ if (a->attr_ptr == ptr_ext4_super_block_offset)
+ return snprintf(buf, PAGE_SIZE, "%u\n",
+ le32_to_cpup(ptr));
+ else
+ return snprintf(buf, PAGE_SIZE, "%u\n",
+ *((unsigned int *) ptr));
case attr_pointer_atomic:
if (!ptr)
return 0;
@@ -308,7 +312,10 @@ static ssize_t ext4_attr_store(struct kobject *kobj,
ret = kstrtoul(skip_spaces(buf), 0, &t);
if (ret)
return ret;
- *((unsigned int *) ptr) = t;
+ if (a->attr_ptr == ptr_ext4_super_block_offset)
+ *((__le32 *) ptr) = cpu_to_le32(t);
+ else
+ *((unsigned int *) ptr) = t;
return len;
case attr_inode_readahead:
return inode_readahead_blks_store(sbi, buf, len);
--
2.9.0
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
commit 229bc19fd7aca4f37964af06e3583c1c8f36b5d6 upstream.
Don't rely on event interrupt (EINT) bit alone to detect pending port
change in resume. If no change event is detected the host may be suspended
again, oterwise roothubs are resumed.
There is a lag in xHC setting EINT. If we don't notice the pending change
in resume, and the controller is runtime suspeded again, it causes the
event handler to assume host is dead as it will fail to read xHC registers
once PCI puts the controller to D3 state.
[ 268.520969] xhci_hcd: xhci_resume: starting port polling.
[ 268.520985] xhci_hcd: xhci_hub_status_data: stopping port polling.
[ 268.521030] xhci_hcd: xhci_suspend: stopping port polling.
[ 268.521040] xhci_hcd: // Setting command ring address to 0x349bd001
[ 268.521139] xhci_hcd: Port Status Change Event for port 3
[ 268.521149] xhci_hcd: resume root hub
[ 268.521163] xhci_hcd: port resume event for port 3
[ 268.521168] xhci_hcd: xHC is not running.
[ 268.521174] xhci_hcd: handle_port_status: starting port polling.
[ 268.596322] xhci_hcd: xhci_hc_died: xHCI host controller not responding, assume dead
The EINT lag is described in a additional note in xhci specs 4.19.2:
"Due to internal xHC scheduling and system delays, there will be a lag
between a change bit being set and the Port Status Change Event that it
generated being written to the Event Ring. If SW reads the PORTSC and
sees a change bit set, there is no guarantee that the corresponding Port
Status Change Event has already been written into the Event Ring."
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
---
drivers/usb/host/xhci.c | 40 +++++++++++++++++++++++++++++++++++++---
drivers/usb/host/xhci.h | 4 ++++
2 files changed, 41 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index e5bccc6d49cf..fe84b36627ec 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -856,6 +856,41 @@ static void xhci_disable_port_wake_on_bits(struct xhci_hcd *xhci)
spin_unlock_irqrestore(&xhci->lock, flags);
}
+static bool xhci_pending_portevent(struct xhci_hcd *xhci)
+{
+ __le32 __iomem **port_array;
+ int port_index;
+ u32 status;
+ u32 portsc;
+
+ status = readl(&xhci->op_regs->status);
+ if (status & STS_EINT)
+ return true;
+ /*
+ * Checking STS_EINT is not enough as there is a lag between a change
+ * bit being set and the Port Status Change Event that it generated
+ * being written to the Event Ring. See note in xhci 1.1 section 4.19.2.
+ */
+
+ port_index = xhci->num_usb2_ports;
+ port_array = xhci->usb2_ports;
+ while (port_index--) {
+ portsc = readl(port_array[port_index]);
+ if (portsc & PORT_CHANGE_MASK ||
+ (portsc & PORT_PLS_MASK) == XDEV_RESUME)
+ return true;
+ }
+ port_index = xhci->num_usb3_ports;
+ port_array = xhci->usb3_ports;
+ while (port_index--) {
+ portsc = readl(port_array[port_index]);
+ if (portsc & PORT_CHANGE_MASK ||
+ (portsc & PORT_PLS_MASK) == XDEV_RESUME)
+ return true;
+ }
+ return false;
+}
+
/*
* Stop HC (not bus-specific)
*
@@ -955,7 +990,7 @@ EXPORT_SYMBOL_GPL(xhci_suspend);
*/
int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
{
- u32 command, temp = 0, status;
+ u32 command, temp = 0;
struct usb_hcd *hcd = xhci_to_hcd(xhci);
struct usb_hcd *secondary_hcd;
int retval = 0;
@@ -1077,8 +1112,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
done:
if (retval == 0) {
/* Resume root hubs only when have pending events. */
- status = readl(&xhci->op_regs->status);
- if (status & STS_EINT) {
+ if (xhci_pending_portevent(xhci)) {
usb_hcd_resume_root_hub(xhci->shared_hcd);
usb_hcd_resume_root_hub(hcd);
}
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 2a72060dda1b..11232e62b898 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -392,6 +392,10 @@ struct xhci_op_regs {
#define PORT_PLC (1 << 22)
/* port configure error change - port failed to configure its link partner */
#define PORT_CEC (1 << 23)
+#define PORT_CHANGE_MASK (PORT_CSC | PORT_PEC | PORT_WRC | PORT_OCC | \
+ PORT_RC | PORT_PLC | PORT_CEC)
+
+
/* Cold Attach Status - xHC can set this bit to report device attached during
* Sx state. Warm port reset should be perfomed to clear this bit and move port
* to connected state.
--
2.17.1
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 02ce6ce2e1d07e31e8314c761a2caa087ea094ce Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher(a)amd.com>
Date: Thu, 12 Jul 2018 08:38:09 -0500
Subject: [PATCH] drm/amdgpu/pp/smu7: use a local variable for toc indexing
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Rather than using the index variable stored in vram. If
the device fails to come back online after a resume cycle,
reads from vram will return all 1s which will cause a
segfault. Based on a patch from Thomas Martitz <kugel(a)rockbox.org>.
This avoids the segfault, but we still need to sort out
why the GPU does not come back online after a resume.
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=105760
Acked-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index d644a9bb9078..9f407c48d4f0 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -381,6 +381,7 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
uint32_t fw_to_load;
int result = 0;
struct SMU_DRAMData_TOC *toc;
+ uint32_t num_entries = 0;
if (!hwmgr->reload_fw) {
pr_info("skip reloading...\n");
@@ -422,41 +423,41 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
}
toc = (struct SMU_DRAMData_TOC *)smu_data->header;
- toc->num_entries = 0;
toc->structure_version = 1;
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_RLC_G, &toc->entry[toc->num_entries++]),
+ UCODE_ID_RLC_G, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_CE, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_CE, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_PFP, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_PFP, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_ME, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_ME, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC_JT1, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC_JT1, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC_JT2, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC_JT2, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_SDMA0, &toc->entry[toc->num_entries++]),
+ UCODE_ID_SDMA0, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_SDMA1, &toc->entry[toc->num_entries++]),
+ UCODE_ID_SDMA1, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
if (!hwmgr->not_vf)
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
+ UCODE_ID_MEC_STORAGE, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
+ toc->num_entries = num_entries;
smu7_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_DRV_DRAM_ADDR_HI, upper_32_bits(smu_data->header_buffer.mc_addr));
smu7_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_DRV_DRAM_ADDR_LO, lower_32_bits(smu_data->header_buffer.mc_addr));
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 02ce6ce2e1d07e31e8314c761a2caa087ea094ce Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher(a)amd.com>
Date: Thu, 12 Jul 2018 08:38:09 -0500
Subject: [PATCH] drm/amdgpu/pp/smu7: use a local variable for toc indexing
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Rather than using the index variable stored in vram. If
the device fails to come back online after a resume cycle,
reads from vram will return all 1s which will cause a
segfault. Based on a patch from Thomas Martitz <kugel(a)rockbox.org>.
This avoids the segfault, but we still need to sort out
why the GPU does not come back online after a resume.
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=105760
Acked-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index d644a9bb9078..9f407c48d4f0 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -381,6 +381,7 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
uint32_t fw_to_load;
int result = 0;
struct SMU_DRAMData_TOC *toc;
+ uint32_t num_entries = 0;
if (!hwmgr->reload_fw) {
pr_info("skip reloading...\n");
@@ -422,41 +423,41 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
}
toc = (struct SMU_DRAMData_TOC *)smu_data->header;
- toc->num_entries = 0;
toc->structure_version = 1;
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_RLC_G, &toc->entry[toc->num_entries++]),
+ UCODE_ID_RLC_G, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_CE, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_CE, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_PFP, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_PFP, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_ME, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_ME, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC_JT1, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC_JT1, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC_JT2, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC_JT2, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_SDMA0, &toc->entry[toc->num_entries++]),
+ UCODE_ID_SDMA0, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_SDMA1, &toc->entry[toc->num_entries++]),
+ UCODE_ID_SDMA1, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
if (!hwmgr->not_vf)
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
+ UCODE_ID_MEC_STORAGE, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
+ toc->num_entries = num_entries;
smu7_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_DRV_DRAM_ADDR_HI, upper_32_bits(smu_data->header_buffer.mc_addr));
smu7_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_DRV_DRAM_ADDR_LO, lower_32_bits(smu_data->header_buffer.mc_addr));
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9d4a0d4cdc8b5904ec7c9b9e04bab3e9e60d7a74 Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Date: Thu, 5 Jul 2018 14:49:34 -0400
Subject: [PATCH] drm/amdgpu: Verify root PD is mapped into kernel address
space (v4)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Problem: When PD/PT update made by CPU root PD was not yet mapped causing
page fault.
Fix: Verify root PD is mapped into CPU address space.
v2:
Make sure that we add the root PD to the relocated list
since then it's get mapped into CPU address space bt default
in amdgpu_vm_update_directories.
v3:
Drop change to not move kernel type BOs to evicted list.
v4:
Remove redundant bo move to relocated list.
Link: https://bugs.freedesktop.org/show_bug.cgi?id=107065
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index edf16b2b957a..fdcb498f6d19 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -107,6 +107,9 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
return;
list_add_tail(&base->bo_list, &bo->va);
+ if (bo->tbo.type == ttm_bo_type_kernel)
+ list_move(&base->vm_status, &vm->relocated);
+
if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
return;
@@ -468,7 +471,6 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev,
pt->parent = amdgpu_bo_ref(parent->base.bo);
amdgpu_vm_bo_base_init(&entry->base, vm, pt);
- list_move(&entry->base.vm_status, &vm->relocated);
}
if (level < AMDGPU_VM_PTB) {
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9d4a0d4cdc8b5904ec7c9b9e04bab3e9e60d7a74 Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Date: Thu, 5 Jul 2018 14:49:34 -0400
Subject: [PATCH] drm/amdgpu: Verify root PD is mapped into kernel address
space (v4)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Problem: When PD/PT update made by CPU root PD was not yet mapped causing
page fault.
Fix: Verify root PD is mapped into CPU address space.
v2:
Make sure that we add the root PD to the relocated list
since then it's get mapped into CPU address space bt default
in amdgpu_vm_update_directories.
v3:
Drop change to not move kernel type BOs to evicted list.
v4:
Remove redundant bo move to relocated list.
Link: https://bugs.freedesktop.org/show_bug.cgi?id=107065
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index edf16b2b957a..fdcb498f6d19 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -107,6 +107,9 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
return;
list_add_tail(&base->bo_list, &bo->va);
+ if (bo->tbo.type == ttm_bo_type_kernel)
+ list_move(&base->vm_status, &vm->relocated);
+
if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
return;
@@ -468,7 +471,6 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev,
pt->parent = amdgpu_bo_ref(parent->base.bo);
amdgpu_vm_bo_base_init(&entry->base, vm, pt);
- list_move(&entry->base.vm_status, &vm->relocated);
}
if (level < AMDGPU_VM_PTB) {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 76fa4975f3ed12d15762bc979ca44078598ed8ee Mon Sep 17 00:00:00 2001
From: Alexey Kardashevskiy <aik(a)ozlabs.ru>
Date: Tue, 17 Jul 2018 17:19:13 +1000
Subject: [PATCH] KVM: PPC: Check if IOMMU page is contained in the pinned
physical page
A VM which has:
- a DMA capable device passed through to it (eg. network card);
- running a malicious kernel that ignores H_PUT_TCE failure;
- capability of using IOMMU pages bigger that physical pages
can create an IOMMU mapping that exposes (for example) 16MB of
the host physical memory to the device when only 64K was allocated to the VM.
The remaining 16MB - 64K will be some other content of host memory, possibly
including pages of the VM, but also pages of host kernel memory, host
programs or other VMs.
The attacking VM does not control the location of the page it can map,
and is only allowed to map as many pages as it has pages of RAM.
We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
an IOMMU page is contained in the physical page so the PCI hardware won't
get access to unassigned host memory; however this check is missing in
the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and
did not hit this yet as the very first time when the mapping happens
we do not have tbl::it_userspace allocated yet and fall back to
the userspace which in turn calls VFIO IOMMU driver, this fails and
the guest does not retry,
This stores the smallest preregistered page size in the preregistered
region descriptor and changes the mm_iommu_xxx API to check this against
the IOMMU page size.
This calculates maximum page size as a minimum of the natural region
alignment and compound page size. For the page shift this uses the shift
returned by find_linux_pte() which indicates how the page is mapped to
the current userspace - if the page is huge and this is not a zero, then
it is a leaf pte and the page is mapped within the range.
Fixes: 121f80ba68f1 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Cc: stable(a)vger.kernel.org # v4.12+
Signed-off-by: Alexey Kardashevskiy <aik(a)ozlabs.ru>
Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 896efa559996..79d570cbf332 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -35,9 +35,9 @@ extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup_rm(
extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
unsigned long ua, unsigned long entries);
extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa);
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa);
extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa);
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa);
extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
#endif
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index d066e37551ec..8c456fa691a5 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -449,7 +449,7 @@ long kvmppc_tce_iommu_do_map(struct kvm *kvm, struct iommu_table *tbl,
/* This only handles v2 IOMMU type, v1 is handled via ioctl() */
return H_TOO_HARD;
- if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, &hpa)))
+ if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, tbl->it_page_shift, &hpa)))
return H_HARDWARE;
if (mm_iommu_mapped_inc(mem))
diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c
index 925fc316a104..5b298f5a1a14 100644
--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
+++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
@@ -279,7 +279,8 @@ static long kvmppc_rm_tce_iommu_do_map(struct kvm *kvm, struct iommu_table *tbl,
if (!mem)
return H_TOO_HARD;
- if (WARN_ON_ONCE_RM(mm_iommu_ua_to_hpa_rm(mem, ua, &hpa)))
+ if (WARN_ON_ONCE_RM(mm_iommu_ua_to_hpa_rm(mem, ua, tbl->it_page_shift,
+ &hpa)))
return H_HARDWARE;
pua = (void *) vmalloc_to_phys(pua);
@@ -469,7 +470,8 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
mem = mm_iommu_lookup_rm(vcpu->kvm->mm, ua, IOMMU_PAGE_SIZE_4K);
if (mem)
- prereg = mm_iommu_ua_to_hpa_rm(mem, ua, &tces) == 0;
+ prereg = mm_iommu_ua_to_hpa_rm(mem, ua,
+ IOMMU_PAGE_SHIFT_4K, &tces) == 0;
}
if (!prereg) {
diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
index abb43646927a..a4ca57612558 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -19,6 +19,7 @@
#include <linux/hugetlb.h>
#include <linux/swap.h>
#include <asm/mmu_context.h>
+#include <asm/pte-walk.h>
static DEFINE_MUTEX(mem_list_mutex);
@@ -27,6 +28,7 @@ struct mm_iommu_table_group_mem_t {
struct rcu_head rcu;
unsigned long used;
atomic64_t mapped;
+ unsigned int pageshift;
u64 ua; /* userspace address */
u64 entries; /* number of entries in hpas[] */
u64 *hpas; /* vmalloc'ed */
@@ -125,6 +127,8 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
{
struct mm_iommu_table_group_mem_t *mem;
long i, j, ret = 0, locked_entries = 0;
+ unsigned int pageshift;
+ unsigned long flags;
struct page *page = NULL;
mutex_lock(&mem_list_mutex);
@@ -159,6 +163,12 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
goto unlock_exit;
}
+ /*
+ * For a starting point for a maximum page size calculation
+ * we use @ua and @entries natural alignment to allow IOMMU pages
+ * smaller than huge pages but still bigger than PAGE_SIZE.
+ */
+ mem->pageshift = __ffs(ua | (entries << PAGE_SHIFT));
mem->hpas = vzalloc(array_size(entries, sizeof(mem->hpas[0])));
if (!mem->hpas) {
kfree(mem);
@@ -199,6 +209,23 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
}
}
populate:
+ pageshift = PAGE_SHIFT;
+ if (PageCompound(page)) {
+ pte_t *pte;
+ struct page *head = compound_head(page);
+ unsigned int compshift = compound_order(head);
+
+ local_irq_save(flags); /* disables as well */
+ pte = find_linux_pte(mm->pgd, ua, NULL, &pageshift);
+ local_irq_restore(flags);
+
+ /* Double check it is still the same pinned page */
+ if (pte && pte_page(*pte) == head &&
+ pageshift == compshift)
+ pageshift = max_t(unsigned int, pageshift,
+ PAGE_SHIFT);
+ }
+ mem->pageshift = min(mem->pageshift, pageshift);
mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
}
@@ -349,7 +376,7 @@ struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
EXPORT_SYMBOL_GPL(mm_iommu_find);
long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa)
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa)
{
const long entry = (ua - mem->ua) >> PAGE_SHIFT;
u64 *va = &mem->hpas[entry];
@@ -357,6 +384,9 @@ long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
if (entry >= mem->entries)
return -EFAULT;
+ if (pageshift > mem->pageshift)
+ return -EFAULT;
+
*hpa = *va | (ua & ~PAGE_MASK);
return 0;
@@ -364,7 +394,7 @@ long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
EXPORT_SYMBOL_GPL(mm_iommu_ua_to_hpa);
long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa)
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa)
{
const long entry = (ua - mem->ua) >> PAGE_SHIFT;
void *va = &mem->hpas[entry];
@@ -373,6 +403,9 @@ long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
if (entry >= mem->entries)
return -EFAULT;
+ if (pageshift > mem->pageshift)
+ return -EFAULT;
+
pa = (void *) vmalloc_to_phys(va);
if (!pa)
return -EFAULT;
diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index 2da5f054257a..7cd63b0c1a46 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -467,7 +467,7 @@ static int tce_iommu_prereg_ua_to_hpa(struct tce_container *container,
if (!mem)
return -EINVAL;
- ret = mm_iommu_ua_to_hpa(mem, tce, phpa);
+ ret = mm_iommu_ua_to_hpa(mem, tce, shift, phpa);
if (ret)
return -EINVAL;
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 76fa4975f3ed12d15762bc979ca44078598ed8ee Mon Sep 17 00:00:00 2001
From: Alexey Kardashevskiy <aik(a)ozlabs.ru>
Date: Tue, 17 Jul 2018 17:19:13 +1000
Subject: [PATCH] KVM: PPC: Check if IOMMU page is contained in the pinned
physical page
A VM which has:
- a DMA capable device passed through to it (eg. network card);
- running a malicious kernel that ignores H_PUT_TCE failure;
- capability of using IOMMU pages bigger that physical pages
can create an IOMMU mapping that exposes (for example) 16MB of
the host physical memory to the device when only 64K was allocated to the VM.
The remaining 16MB - 64K will be some other content of host memory, possibly
including pages of the VM, but also pages of host kernel memory, host
programs or other VMs.
The attacking VM does not control the location of the page it can map,
and is only allowed to map as many pages as it has pages of RAM.
We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
an IOMMU page is contained in the physical page so the PCI hardware won't
get access to unassigned host memory; however this check is missing in
the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and
did not hit this yet as the very first time when the mapping happens
we do not have tbl::it_userspace allocated yet and fall back to
the userspace which in turn calls VFIO IOMMU driver, this fails and
the guest does not retry,
This stores the smallest preregistered page size in the preregistered
region descriptor and changes the mm_iommu_xxx API to check this against
the IOMMU page size.
This calculates maximum page size as a minimum of the natural region
alignment and compound page size. For the page shift this uses the shift
returned by find_linux_pte() which indicates how the page is mapped to
the current userspace - if the page is huge and this is not a zero, then
it is a leaf pte and the page is mapped within the range.
Fixes: 121f80ba68f1 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Cc: stable(a)vger.kernel.org # v4.12+
Signed-off-by: Alexey Kardashevskiy <aik(a)ozlabs.ru>
Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 896efa559996..79d570cbf332 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -35,9 +35,9 @@ extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup_rm(
extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
unsigned long ua, unsigned long entries);
extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa);
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa);
extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa);
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa);
extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
#endif
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index d066e37551ec..8c456fa691a5 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -449,7 +449,7 @@ long kvmppc_tce_iommu_do_map(struct kvm *kvm, struct iommu_table *tbl,
/* This only handles v2 IOMMU type, v1 is handled via ioctl() */
return H_TOO_HARD;
- if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, &hpa)))
+ if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, tbl->it_page_shift, &hpa)))
return H_HARDWARE;
if (mm_iommu_mapped_inc(mem))
diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c
index 925fc316a104..5b298f5a1a14 100644
--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
+++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
@@ -279,7 +279,8 @@ static long kvmppc_rm_tce_iommu_do_map(struct kvm *kvm, struct iommu_table *tbl,
if (!mem)
return H_TOO_HARD;
- if (WARN_ON_ONCE_RM(mm_iommu_ua_to_hpa_rm(mem, ua, &hpa)))
+ if (WARN_ON_ONCE_RM(mm_iommu_ua_to_hpa_rm(mem, ua, tbl->it_page_shift,
+ &hpa)))
return H_HARDWARE;
pua = (void *) vmalloc_to_phys(pua);
@@ -469,7 +470,8 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
mem = mm_iommu_lookup_rm(vcpu->kvm->mm, ua, IOMMU_PAGE_SIZE_4K);
if (mem)
- prereg = mm_iommu_ua_to_hpa_rm(mem, ua, &tces) == 0;
+ prereg = mm_iommu_ua_to_hpa_rm(mem, ua,
+ IOMMU_PAGE_SHIFT_4K, &tces) == 0;
}
if (!prereg) {
diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
index abb43646927a..a4ca57612558 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -19,6 +19,7 @@
#include <linux/hugetlb.h>
#include <linux/swap.h>
#include <asm/mmu_context.h>
+#include <asm/pte-walk.h>
static DEFINE_MUTEX(mem_list_mutex);
@@ -27,6 +28,7 @@ struct mm_iommu_table_group_mem_t {
struct rcu_head rcu;
unsigned long used;
atomic64_t mapped;
+ unsigned int pageshift;
u64 ua; /* userspace address */
u64 entries; /* number of entries in hpas[] */
u64 *hpas; /* vmalloc'ed */
@@ -125,6 +127,8 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
{
struct mm_iommu_table_group_mem_t *mem;
long i, j, ret = 0, locked_entries = 0;
+ unsigned int pageshift;
+ unsigned long flags;
struct page *page = NULL;
mutex_lock(&mem_list_mutex);
@@ -159,6 +163,12 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
goto unlock_exit;
}
+ /*
+ * For a starting point for a maximum page size calculation
+ * we use @ua and @entries natural alignment to allow IOMMU pages
+ * smaller than huge pages but still bigger than PAGE_SIZE.
+ */
+ mem->pageshift = __ffs(ua | (entries << PAGE_SHIFT));
mem->hpas = vzalloc(array_size(entries, sizeof(mem->hpas[0])));
if (!mem->hpas) {
kfree(mem);
@@ -199,6 +209,23 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
}
}
populate:
+ pageshift = PAGE_SHIFT;
+ if (PageCompound(page)) {
+ pte_t *pte;
+ struct page *head = compound_head(page);
+ unsigned int compshift = compound_order(head);
+
+ local_irq_save(flags); /* disables as well */
+ pte = find_linux_pte(mm->pgd, ua, NULL, &pageshift);
+ local_irq_restore(flags);
+
+ /* Double check it is still the same pinned page */
+ if (pte && pte_page(*pte) == head &&
+ pageshift == compshift)
+ pageshift = max_t(unsigned int, pageshift,
+ PAGE_SHIFT);
+ }
+ mem->pageshift = min(mem->pageshift, pageshift);
mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
}
@@ -349,7 +376,7 @@ struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
EXPORT_SYMBOL_GPL(mm_iommu_find);
long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa)
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa)
{
const long entry = (ua - mem->ua) >> PAGE_SHIFT;
u64 *va = &mem->hpas[entry];
@@ -357,6 +384,9 @@ long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
if (entry >= mem->entries)
return -EFAULT;
+ if (pageshift > mem->pageshift)
+ return -EFAULT;
+
*hpa = *va | (ua & ~PAGE_MASK);
return 0;
@@ -364,7 +394,7 @@ long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
EXPORT_SYMBOL_GPL(mm_iommu_ua_to_hpa);
long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa)
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa)
{
const long entry = (ua - mem->ua) >> PAGE_SHIFT;
void *va = &mem->hpas[entry];
@@ -373,6 +403,9 @@ long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
if (entry >= mem->entries)
return -EFAULT;
+ if (pageshift > mem->pageshift)
+ return -EFAULT;
+
pa = (void *) vmalloc_to_phys(va);
if (!pa)
return -EFAULT;
diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index 2da5f054257a..7cd63b0c1a46 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -467,7 +467,7 @@ static int tce_iommu_prereg_ua_to_hpa(struct tce_container *container,
if (!mem)
return -EINVAL;
- ret = mm_iommu_ua_to_hpa(mem, tce, phpa);
+ ret = mm_iommu_ua_to_hpa(mem, tce, shift, phpa);
if (ret)
return -EINVAL;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From bd3599a0e142cd73edd3b6801068ac3f48ac771a Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Thu, 12 Jul 2018 01:36:43 +0100
Subject: [PATCH] Btrfs: fix file data corruption after cloning a range and
fsync
When we clone a range into a file we can end up dropping existing
extent maps (or trimming them) and replacing them with new ones if the
range to be cloned overlaps with a range in the destination inode.
When that happens we add the new extent maps to the list of modified
extents in the inode's extent map tree, so that a "fast" fsync (the flag
BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps
and log corresponding extent items. However, at the end of range cloning
operation we do truncate all the pages in the affected range (in order to
ensure future reads will not get stale data). Sometimes this truncation
will release the corresponding extent maps besides the pages from the page
cache. If this happens, then a "fast" fsync operation will miss logging
some extent items, because it relies exclusively on the extent maps being
present in the inode's extent tree, leading to data loss/corruption if
the fsync ends up using the same transaction used by the clone operation
(that transaction was not committed in the meanwhile). An extent map is
released through the callback btrfs_invalidatepage(), which gets called by
truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The
later ends up calling try_release_extent_mapping() which will release the
extent map if some conditions are met, like the file size being greater
than 16Mb, gfp flags allow blocking and the range not being locked (which
is the case during the clone operation) nor being the extent map flagged
as pinned (also the case for cloning).
The following example, turned into a test for fstests, reproduces the
issue:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo
$ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
# reflink destination offset corresponds to the size of file bar,
# 2728Kb minus 4Kb.
$ xfs_io -c ""reflink ${SCRATCH_MNT}/foo 0 2724K 15908K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
$ md5sum /mnt/bar
95a95813a8c2abc9aa75a6c2914a077e /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
$ md5sum /mnt/bar
207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar
# digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the
# power failure
In the above example, the destination offset of the clone operation
corresponds to the size of the "bar" file minus 4Kb. So during the clone
operation, the extent map covering the range from 2572Kb to 2728Kb gets
trimmed so that it ends at offset 2724Kb, and a new extent map covering
the range from 2724Kb to 11724Kb is created. So at the end of the clone
operation when we ask to truncate the pages in the range from 2724Kb to
2724Kb + 15908Kb, the page invalidation callback ends up removing the new
extent map (through try_release_extent_mapping()) when the page at offset
2724Kb is passed to that callback.
Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent
map is removed at try_release_extent_mapping(), forcing the next fsync to
search for modified extents in the fs/subvolume tree instead of relying on
the presence of extent maps in memory. This way we can continue doing a
"fast" fsync if the destination range of a clone operation does not
overlap with an existing range or if any of the criteria necessary to
remove an extent map at try_release_extent_mapping() is not met (file
size not bigger then 16Mb or gfp flags do not allow blocking).
CC: stable(a)vger.kernel.org # 3.16+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1aa91d57404a..8fd86e29085f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4241,8 +4241,9 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
struct extent_map *em;
u64 start = page_offset(page);
u64 end = start + PAGE_SIZE - 1;
- struct extent_io_tree *tree = &BTRFS_I(page->mapping->host)->io_tree;
- struct extent_map_tree *map = &BTRFS_I(page->mapping->host)->extent_tree;
+ struct btrfs_inode *btrfs_inode = BTRFS_I(page->mapping->host);
+ struct extent_io_tree *tree = &btrfs_inode->io_tree;
+ struct extent_map_tree *map = &btrfs_inode->extent_tree;
if (gfpflags_allow_blocking(mask) &&
page->mapping->host->i_size > SZ_16M) {
@@ -4265,6 +4266,8 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
extent_map_end(em) - 1,
EXTENT_LOCKED | EXTENT_WRITEBACK,
0, NULL)) {
+ set_bit(BTRFS_INODE_NEEDS_FULL_SYNC,
+ &btrfs_inode->runtime_flags);
remove_extent_mapping(map, em);
/* once for the rb tree */
free_extent_map(em);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From bd3599a0e142cd73edd3b6801068ac3f48ac771a Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Thu, 12 Jul 2018 01:36:43 +0100
Subject: [PATCH] Btrfs: fix file data corruption after cloning a range and
fsync
When we clone a range into a file we can end up dropping existing
extent maps (or trimming them) and replacing them with new ones if the
range to be cloned overlaps with a range in the destination inode.
When that happens we add the new extent maps to the list of modified
extents in the inode's extent map tree, so that a "fast" fsync (the flag
BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps
and log corresponding extent items. However, at the end of range cloning
operation we do truncate all the pages in the affected range (in order to
ensure future reads will not get stale data). Sometimes this truncation
will release the corresponding extent maps besides the pages from the page
cache. If this happens, then a "fast" fsync operation will miss logging
some extent items, because it relies exclusively on the extent maps being
present in the inode's extent tree, leading to data loss/corruption if
the fsync ends up using the same transaction used by the clone operation
(that transaction was not committed in the meanwhile). An extent map is
released through the callback btrfs_invalidatepage(), which gets called by
truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The
later ends up calling try_release_extent_mapping() which will release the
extent map if some conditions are met, like the file size being greater
than 16Mb, gfp flags allow blocking and the range not being locked (which
is the case during the clone operation) nor being the extent map flagged
as pinned (also the case for cloning).
The following example, turned into a test for fstests, reproduces the
issue:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo
$ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
# reflink destination offset corresponds to the size of file bar,
# 2728Kb minus 4Kb.
$ xfs_io -c ""reflink ${SCRATCH_MNT}/foo 0 2724K 15908K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
$ md5sum /mnt/bar
95a95813a8c2abc9aa75a6c2914a077e /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
$ md5sum /mnt/bar
207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar
# digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the
# power failure
In the above example, the destination offset of the clone operation
corresponds to the size of the "bar" file minus 4Kb. So during the clone
operation, the extent map covering the range from 2572Kb to 2728Kb gets
trimmed so that it ends at offset 2724Kb, and a new extent map covering
the range from 2724Kb to 11724Kb is created. So at the end of the clone
operation when we ask to truncate the pages in the range from 2724Kb to
2724Kb + 15908Kb, the page invalidation callback ends up removing the new
extent map (through try_release_extent_mapping()) when the page at offset
2724Kb is passed to that callback.
Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent
map is removed at try_release_extent_mapping(), forcing the next fsync to
search for modified extents in the fs/subvolume tree instead of relying on
the presence of extent maps in memory. This way we can continue doing a
"fast" fsync if the destination range of a clone operation does not
overlap with an existing range or if any of the criteria necessary to
remove an extent map at try_release_extent_mapping() is not met (file
size not bigger then 16Mb or gfp flags do not allow blocking).
CC: stable(a)vger.kernel.org # 3.16+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1aa91d57404a..8fd86e29085f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4241,8 +4241,9 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
struct extent_map *em;
u64 start = page_offset(page);
u64 end = start + PAGE_SIZE - 1;
- struct extent_io_tree *tree = &BTRFS_I(page->mapping->host)->io_tree;
- struct extent_map_tree *map = &BTRFS_I(page->mapping->host)->extent_tree;
+ struct btrfs_inode *btrfs_inode = BTRFS_I(page->mapping->host);
+ struct extent_io_tree *tree = &btrfs_inode->io_tree;
+ struct extent_map_tree *map = &btrfs_inode->extent_tree;
if (gfpflags_allow_blocking(mask) &&
page->mapping->host->i_size > SZ_16M) {
@@ -4265,6 +4266,8 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
extent_map_end(em) - 1,
EXTENT_LOCKED | EXTENT_WRITEBACK,
0, NULL)) {
+ set_bit(BTRFS_INODE_NEEDS_FULL_SYNC,
+ &btrfs_inode->runtime_flags);
remove_extent_mapping(map, em);
/* once for the rb tree */
free_extent_map(em);
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From bd3599a0e142cd73edd3b6801068ac3f48ac771a Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Thu, 12 Jul 2018 01:36:43 +0100
Subject: [PATCH] Btrfs: fix file data corruption after cloning a range and
fsync
When we clone a range into a file we can end up dropping existing
extent maps (or trimming them) and replacing them with new ones if the
range to be cloned overlaps with a range in the destination inode.
When that happens we add the new extent maps to the list of modified
extents in the inode's extent map tree, so that a "fast" fsync (the flag
BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps
and log corresponding extent items. However, at the end of range cloning
operation we do truncate all the pages in the affected range (in order to
ensure future reads will not get stale data). Sometimes this truncation
will release the corresponding extent maps besides the pages from the page
cache. If this happens, then a "fast" fsync operation will miss logging
some extent items, because it relies exclusively on the extent maps being
present in the inode's extent tree, leading to data loss/corruption if
the fsync ends up using the same transaction used by the clone operation
(that transaction was not committed in the meanwhile). An extent map is
released through the callback btrfs_invalidatepage(), which gets called by
truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The
later ends up calling try_release_extent_mapping() which will release the
extent map if some conditions are met, like the file size being greater
than 16Mb, gfp flags allow blocking and the range not being locked (which
is the case during the clone operation) nor being the extent map flagged
as pinned (also the case for cloning).
The following example, turned into a test for fstests, reproduces the
issue:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo
$ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
# reflink destination offset corresponds to the size of file bar,
# 2728Kb minus 4Kb.
$ xfs_io -c ""reflink ${SCRATCH_MNT}/foo 0 2724K 15908K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
$ md5sum /mnt/bar
95a95813a8c2abc9aa75a6c2914a077e /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
$ md5sum /mnt/bar
207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar
# digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the
# power failure
In the above example, the destination offset of the clone operation
corresponds to the size of the "bar" file minus 4Kb. So during the clone
operation, the extent map covering the range from 2572Kb to 2728Kb gets
trimmed so that it ends at offset 2724Kb, and a new extent map covering
the range from 2724Kb to 11724Kb is created. So at the end of the clone
operation when we ask to truncate the pages in the range from 2724Kb to
2724Kb + 15908Kb, the page invalidation callback ends up removing the new
extent map (through try_release_extent_mapping()) when the page at offset
2724Kb is passed to that callback.
Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent
map is removed at try_release_extent_mapping(), forcing the next fsync to
search for modified extents in the fs/subvolume tree instead of relying on
the presence of extent maps in memory. This way we can continue doing a
"fast" fsync if the destination range of a clone operation does not
overlap with an existing range or if any of the criteria necessary to
remove an extent map at try_release_extent_mapping() is not met (file
size not bigger then 16Mb or gfp flags do not allow blocking).
CC: stable(a)vger.kernel.org # 3.16+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1aa91d57404a..8fd86e29085f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4241,8 +4241,9 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
struct extent_map *em;
u64 start = page_offset(page);
u64 end = start + PAGE_SIZE - 1;
- struct extent_io_tree *tree = &BTRFS_I(page->mapping->host)->io_tree;
- struct extent_map_tree *map = &BTRFS_I(page->mapping->host)->extent_tree;
+ struct btrfs_inode *btrfs_inode = BTRFS_I(page->mapping->host);
+ struct extent_io_tree *tree = &btrfs_inode->io_tree;
+ struct extent_map_tree *map = &btrfs_inode->extent_tree;
if (gfpflags_allow_blocking(mask) &&
page->mapping->host->i_size > SZ_16M) {
@@ -4265,6 +4266,8 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
extent_map_end(em) - 1,
EXTENT_LOCKED | EXTENT_WRITEBACK,
0, NULL)) {
+ set_bit(BTRFS_INODE_NEEDS_FULL_SYNC,
+ &btrfs_inode->runtime_flags);
remove_extent_mapping(map, em);
/* once for the rb tree */
free_extent_map(em);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 95d6c0857e54b788982746071130d822a795026b Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Wed, 18 Jul 2018 13:38:37 +0200
Subject: [PATCH] cpufreq: intel_pstate: Register when ACPI PCCH is present
Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.
Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.
For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.
Fixes: fbbcdc0744da (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann <aherrmann(a)suse.com>
Reviewed-by: Andreas Herrmann <aherrmann(a)suse.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Tested-by: Andreas Herrmann <aherrmann(a)suse.com>
Cc: 4.16+ <stable(a)vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index ece120da3353..3c3971256130 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2394,6 +2394,18 @@ static bool __init intel_pstate_no_acpi_pss(void)
return true;
}
+static bool __init intel_pstate_no_acpi_pcch(void)
+{
+ acpi_status status;
+ acpi_handle handle;
+
+ status = acpi_get_handle(NULL, "\\_SB", &handle);
+ if (ACPI_FAILURE(status))
+ return true;
+
+ return !acpi_has_method(handle, "PCCH");
+}
+
static bool __init intel_pstate_has_acpi_ppc(void)
{
int i;
@@ -2453,7 +2465,10 @@ static bool __init intel_pstate_platform_pwr_mgmt_exists(void)
switch (plat_info[idx].data) {
case PSS:
- return intel_pstate_no_acpi_pss();
+ if (!intel_pstate_no_acpi_pss())
+ return false;
+
+ return intel_pstate_no_acpi_pcch();
case PPC:
return intel_pstate_has_acpi_ppc() && !force_load;
}
diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c
index 3f0ce2ae35ee..0c56c9759672 100644
--- a/drivers/cpufreq/pcc-cpufreq.c
+++ b/drivers/cpufreq/pcc-cpufreq.c
@@ -580,6 +580,10 @@ static int __init pcc_cpufreq_init(void)
{
int ret;
+ /* Skip initialization if another cpufreq driver is there. */
+ if (cpufreq_get_current_driver())
+ return 0;
+
if (acpi_disabled)
return 0;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 95d6c0857e54b788982746071130d822a795026b Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Wed, 18 Jul 2018 13:38:37 +0200
Subject: [PATCH] cpufreq: intel_pstate: Register when ACPI PCCH is present
Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.
Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.
For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.
Fixes: fbbcdc0744da (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann <aherrmann(a)suse.com>
Reviewed-by: Andreas Herrmann <aherrmann(a)suse.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Tested-by: Andreas Herrmann <aherrmann(a)suse.com>
Cc: 4.16+ <stable(a)vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index ece120da3353..3c3971256130 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2394,6 +2394,18 @@ static bool __init intel_pstate_no_acpi_pss(void)
return true;
}
+static bool __init intel_pstate_no_acpi_pcch(void)
+{
+ acpi_status status;
+ acpi_handle handle;
+
+ status = acpi_get_handle(NULL, "\\_SB", &handle);
+ if (ACPI_FAILURE(status))
+ return true;
+
+ return !acpi_has_method(handle, "PCCH");
+}
+
static bool __init intel_pstate_has_acpi_ppc(void)
{
int i;
@@ -2453,7 +2465,10 @@ static bool __init intel_pstate_platform_pwr_mgmt_exists(void)
switch (plat_info[idx].data) {
case PSS:
- return intel_pstate_no_acpi_pss();
+ if (!intel_pstate_no_acpi_pss())
+ return false;
+
+ return intel_pstate_no_acpi_pcch();
case PPC:
return intel_pstate_has_acpi_ppc() && !force_load;
}
diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c
index 3f0ce2ae35ee..0c56c9759672 100644
--- a/drivers/cpufreq/pcc-cpufreq.c
+++ b/drivers/cpufreq/pcc-cpufreq.c
@@ -580,6 +580,10 @@ static int __init pcc_cpufreq_init(void)
{
int ret;
+ /* Skip initialization if another cpufreq driver is there. */
+ if (cpufreq_get_current_driver())
+ return 0;
+
if (acpi_disabled)
return 0;
Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle")
allows the writeback rate to be faster if there is no I/O request on a
bcache device. It works well if there is only one bcache device attached
to the cache set. If there are many bcache devices attached to a cache
set, it may introduce performance regression because multiple faster
writeback threads of the idle bcache devices will compete the btree level
locks with the bcache device who have I/O requests coming.
This patch fixes the above issue by only permitting fast writebac when
all bcache devices attached on the cache set are idle. And if one of the
bcache devices has new I/O request coming, minimized all writeback
throughput immediately and let PI controller __update_writeback_rate()
to decide the upcoming writeback rate for each bcache device.
Also when all bcache devices are idle, limited wrieback rate to a small
number is wast of thoughput, especially when backing devices are slower
non-rotation devices (e.g. SATA SSD). This patch sets a max writeback
rate for each backing device if the whole cache set is idle. A faster
writeback rate in idle time means new I/Os may have more available space
for dirty data, and people may observe a better write performance then.
Please note bcache may change its cache mode in run time, and this patch
still works if the cache mode is switched from writeback mode and there
is still dirty data on cache.
Fixes: Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle")
Cc: stable(a)vger.kernel.org #4.16+
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Michael Lyle <mlyle(a)lyle.org>
---
drivers/md/bcache/bcache.h | 9 +---
drivers/md/bcache/request.c | 42 ++++++++++++++-
drivers/md/bcache/sysfs.c | 14 +++--
drivers/md/bcache/util.c | 2 +-
drivers/md/bcache/util.h | 2 +-
drivers/md/bcache/writeback.c | 98 +++++++++++++++++++++++++----------
6 files changed, 126 insertions(+), 41 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index d6bf294f3907..f7451e8be03c 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -328,13 +328,6 @@ struct cached_dev {
*/
atomic_t has_dirty;
- /*
- * Set to zero by things that touch the backing volume-- except
- * writeback. Incremented by writeback. Used to determine when to
- * accelerate idle writeback.
- */
- atomic_t backing_idle;
-
struct bch_ratelimit writeback_rate;
struct delayed_work writeback_rate_update;
@@ -514,6 +507,8 @@ struct cache_set {
struct cache_accounting accounting;
unsigned long flags;
+ atomic_t idle_counter;
+ atomic_t at_max_writeback_rate;
struct cache_sb sb;
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index ae67f5fa8047..fe45f561a054 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -1104,6 +1104,34 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio)
/* Cached devices - read & write stuff */
+static void quit_max_writeback_rate(struct cache_set *c)
+{
+ int i;
+ struct bcache_device *d;
+ struct cached_dev *dc;
+
+ mutex_lock(&bch_register_lock);
+
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ /*
+ * set writeback rate to default minimum value,
+ * then let update_writeback_rate() to decide the
+ * upcoming rate.
+ */
+ atomic64_set(&dc->writeback_rate.rate, 1);
+ }
+
+ mutex_unlock(&bch_register_lock);
+}
+
static blk_qc_t cached_dev_make_request(struct request_queue *q,
struct bio *bio)
{
@@ -1119,7 +1147,19 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
return BLK_QC_T_NONE;
}
- atomic_set(&dc->backing_idle, 0);
+ if (d->c) {
+ atomic_set(&d->c->idle_counter, 0);
+ /*
+ * If at_max_writeback_rate of cache set is true and new I/O
+ * comes, quit max writeback rate of all cached devices
+ * attached to this cache set, and set at_max_writeback_rate
+ * to false.
+ */
+ if (unlikely(atomic_read(&d->c->at_max_writeback_rate) == 1)) {
+ atomic_set(&d->c->at_max_writeback_rate, 0);
+ quit_max_writeback_rate(d->c);
+ }
+ }
generic_start_io_acct(q, rw, bio_sectors(bio), &d->disk->part0);
bio_set_dev(bio, dc->bdev);
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 225b15aa0340..d719021bff81 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -170,7 +170,8 @@ SHOW(__bch_cached_dev)
var_printf(writeback_running, "%i");
var_print(writeback_delay);
var_print(writeback_percent);
- sysfs_hprint(writeback_rate, dc->writeback_rate.rate << 9);
+ sysfs_hprint(writeback_rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
sysfs_hprint(io_errors, atomic_read(&dc->io_errors));
sysfs_printf(io_error_limit, "%i", dc->error_limit);
sysfs_printf(io_disable, "%i", dc->io_disable);
@@ -188,7 +189,8 @@ SHOW(__bch_cached_dev)
char change[20];
s64 next_io;
- bch_hprint(rate, dc->writeback_rate.rate << 9);
+ bch_hprint(rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
bch_hprint(dirty, bcache_dev_sectors_dirty(&dc->disk) << 9);
bch_hprint(target, dc->writeback_rate_target << 9);
bch_hprint(proportional,dc->writeback_rate_proportional << 9);
@@ -255,8 +257,12 @@ STORE(__cached_dev)
sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent, 0, 40);
- sysfs_strtoul_clamp(writeback_rate,
- dc->writeback_rate.rate, 1, INT_MAX);
+ if (attr == &sysfs_writeback_rate) {
+ int v;
+
+ sysfs_strtoul_clamp(writeback_rate, v, 1, INT_MAX);
+ atomic64_set(&dc->writeback_rate.rate, v);
+ }
sysfs_strtoul_clamp(writeback_rate_update_seconds,
dc->writeback_rate_update_seconds,
diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index fc479b026d6d..84f90c3d996d 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -200,7 +200,7 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
{
uint64_t now = local_clock();
- d->next += div_u64(done * NSEC_PER_SEC, d->rate);
+ d->next += div_u64(done * NSEC_PER_SEC, atomic64_read(&d->rate));
/* Bound the time. Don't let us fall further than 2 seconds behind
* (this prevents unnecessary backlog that would make it impossible
diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h
index cced87f8eb27..7e17f32ab563 100644
--- a/drivers/md/bcache/util.h
+++ b/drivers/md/bcache/util.h
@@ -442,7 +442,7 @@ struct bch_ratelimit {
* Rate at which we want to do work, in units per second
* The units here correspond to the units passed to bch_next_delay()
*/
- uint32_t rate;
+ atomic64_t rate;
};
static inline void bch_ratelimit_reset(struct bch_ratelimit *d)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index ad45ebe1a74b..72059f910230 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -49,6 +49,63 @@ static uint64_t __calc_target_rate(struct cached_dev *dc)
return (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;
}
+static bool set_at_max_writeback_rate(struct cache_set *c,
+ struct cached_dev *dc)
+{
+ int i, dirty_dc_nr = 0;
+ struct bcache_device *d;
+
+ mutex_lock(&bch_register_lock);
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ if (atomic_read(&dc->has_dirty))
+ dirty_dc_nr++;
+ }
+ mutex_unlock(&bch_register_lock);
+
+ /*
+ * Idle_counter is increased everytime when update_writeback_rate()
+ * is rescheduled in. If all backing devices attached to the same
+ * cache set has same dc->writeback_rate_update_seconds value, it
+ * is about 10 rounds of update_writeback_rate() is called on each
+ * backing device, then the code will fall through at set 1 to
+ * c->at_max_writeback_rate, and a max wrteback rate to each
+ * dc->writeback_rate.rate. This is not very accurate but works well
+ * to make sure the whole cache set has no new I/O coming before
+ * writeback rate is set to a max number.
+ */
+ if (atomic_inc_return(&c->idle_counter) < dirty_dc_nr * 10)
+ return false;
+
+ if (atomic_read(&c->at_max_writeback_rate) != 1)
+ atomic_set(&c->at_max_writeback_rate, 1);
+
+
+ atomic64_set(&dc->writeback_rate.rate, INT_MAX);
+
+ /* keep writeback_rate_target as existing value */
+ dc->writeback_rate_proportional = 0;
+ dc->writeback_rate_integral_scaled = 0;
+ dc->writeback_rate_change = 0;
+
+ /*
+ * Check c->idle_counter and c->at_max_writeback_rate agagain in case
+ * new I/O arrives during before set_at_max_writeback_rate() returns.
+ * Then the writeback rate is set to 1, and its new value should be
+ * decided via __update_writeback_rate().
+ */
+ if (atomic_read(&c->idle_counter) < dirty_dc_nr * 10 ||
+ !atomic_read(&c->at_max_writeback_rate))
+ return false;
+
+ return true;
+}
+
static void __update_writeback_rate(struct cached_dev *dc)
{
/*
@@ -104,8 +161,9 @@ static void __update_writeback_rate(struct cached_dev *dc)
dc->writeback_rate_proportional = proportional_scaled;
dc->writeback_rate_integral_scaled = integral_scaled;
- dc->writeback_rate_change = new_rate - dc->writeback_rate.rate;
- dc->writeback_rate.rate = new_rate;
+ dc->writeback_rate_change = new_rate -
+ atomic64_read(&dc->writeback_rate.rate);
+ atomic64_set(&dc->writeback_rate.rate, new_rate);
dc->writeback_rate_target = target;
}
@@ -138,9 +196,16 @@ static void update_writeback_rate(struct work_struct *work)
down_read(&dc->writeback_lock);
- if (atomic_read(&dc->has_dirty) &&
- dc->writeback_percent)
- __update_writeback_rate(dc);
+ if (atomic_read(&dc->has_dirty) && dc->writeback_percent) {
+ /*
+ * If the whole cache set is idle, set_at_max_writeback_rate()
+ * will set writeback rate to a max number. Then it is
+ * unncessary to update writeback rate for an idle cache set
+ * in maximum writeback rate number(s).
+ */
+ if (!set_at_max_writeback_rate(c, dc))
+ __update_writeback_rate(dc);
+ }
up_read(&dc->writeback_lock);
@@ -422,27 +487,6 @@ static void read_dirty(struct cached_dev *dc)
delay = writeback_delay(dc, size);
- /* If the control system would wait for at least half a
- * second, and there's been no reqs hitting the backing disk
- * for awhile: use an alternate mode where we have at most
- * one contiguous set of writebacks in flight at a time. If
- * someone wants to do IO it will be quick, as it will only
- * have to contend with one operation in flight, and we'll
- * be round-tripping data to the backing disk as quickly as
- * it can accept it.
- */
- if (delay >= HZ / 2) {
- /* 3 means at least 1.5 seconds, up to 7.5 if we
- * have slowed way down.
- */
- if (atomic_inc_return(&dc->backing_idle) >= 3) {
- /* Wait for current I/Os to finish */
- closure_sync(&cl);
- /* And immediately launch a new set. */
- delay = 0;
- }
- }
-
while (!kthread_should_stop() &&
!test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
delay) {
@@ -715,7 +759,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
dc->writeback_running = true;
dc->writeback_percent = 10;
dc->writeback_delay = 30;
- dc->writeback_rate.rate = 1024;
+ atomic64_set(&dc->writeback_rate.rate, 1024);
dc->writeback_rate_minimum = 8;
dc->writeback_rate_update_seconds = WRITEBACK_RATE_UPDATE_SECS_DEFAULT;
--
2.17.1
This is the start of the stable review cycle for the 3.18.116 release.
There are 29 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Jul 22 11:51:47 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.116-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 3.18.116-rc1
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
Santosh Shilimkar <santosh.shilimkar(a)oracle.com>
rds: avoid unenecessary cong_update in loop transport
Eric Biggers <ebiggers(a)google.com>
KEYS: DNS: fix parsing multiple options
Florian Westphal <fw(a)strlen.de>
netfilter: ebtables: reject non-bridge targets
Alex Vesker <valex(a)mellanox.com>
net/mlx5: Fix command interface race in polling mode
Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
net_sched: blackhole: tell upper qdisc about dropped packets
Jason Wang <jasowang(a)redhat.com>
vhost_net: validate sock before trying to put its fd
Ilpo Järvinen <ilpo.jarvinen(a)helsinki.fi>
tcp: prevent bogus FRTO undos with non-SACK flows
Yuchung Cheng <ycheng(a)google.com>
tcp: fix Fast Open key endianness
Eric Dumazet <edumazet(a)google.com>
net: sungem: fix rx checksum support
Alex Vesker <valex(a)mellanox.com>
net/mlx5: Fix incorrect raw command length parsing
Eric Dumazet <edumazet(a)google.com>
net: dccp: switch rx_tstamp_last_feedback to monotonic clock
Eric Dumazet <edumazet(a)google.com>
net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
Christian Lamparter <chunkeey(a)googlemail.com>
crypto: crypto4xx - fix crypto4xx_build_pdr, crypto4xx_build_sdr leak
Christian Lamparter <chunkeey(a)googlemail.com>
crypto: crypto4xx - remove bad list_del
Jonas Gorski <jonas.gorski(a)gmail.com>
bcm63xx_enet: do not write to random DMA channel on BCM6345
Jonas Gorski <jonas.gorski(a)gmail.com>
bcm63xx_enet: correct clock usage
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
loop: remember whether sysfs_create_group() was done
Leon Romanovsky <leonro(a)mellanox.com>
RDMA/ucm: Mark UCM interface as BROKEN
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
PM / hibernate: Fix oops at snapshot_write()
Theodore Ts'o <tytso(a)mit.edu>
loop: add recursion validation to LOOP_CHANGE_FD
Florian Westphal <fw(a)strlen.de>
netfilter: x_tables: initialise match/target check parameter struct
Linus Torvalds <torvalds(a)linux-foundation.org>
Fix up non-directory creation in SGID directories
Dan Carpenter <dan.carpenter(a)oracle.com>
xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
Nico Sneck <snecknico(a)gmail.com>
usb: quirks: add delay quirks for Corsair Strafe
Johan Hovold <johan(a)kernel.org>
USB: serial: mos7840: fix status-register error handling
Jann Horn <jannh(a)google.com>
USB: yurex: fix out-of-bounds uaccess in read handler
Johan Hovold <johan(a)kernel.org>
USB: serial: keyspan_pda: fix modem-status error handling
Jann Horn <jannh(a)google.com>
ibmasm: don't write out of bounds in read handler
-------------
Diffstat:
Makefile | 4 +-
drivers/block/loop.c | 79 +++++++++++++++------------
drivers/block/loop.h | 1 +
drivers/crypto/amcc/crypto4xx_core.c | 23 ++++----
drivers/infiniband/Kconfig | 12 ++++
drivers/infiniband/core/Makefile | 4 +-
drivers/misc/ibmasm/ibmasmfs.c | 27 +--------
drivers/net/ethernet/broadcom/bcm63xx_enet.c | 34 +++++++++---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 8 +--
drivers/net/ethernet/sun/sungem.c | 22 ++++----
drivers/usb/core/quirks.c | 4 ++
drivers/usb/host/xhci-mem.c | 2 +-
drivers/usb/misc/yurex.c | 23 ++------
drivers/usb/serial/keyspan_pda.c | 4 +-
drivers/usb/serial/mos7840.c | 3 +
drivers/vhost/net.c | 3 +-
fs/inode.c | 6 ++
kernel/power/user.c | 5 ++
net/bridge/netfilter/ebtables.c | 15 +++++
net/dccp/ccids/ccid3.c | 16 +++---
net/dns_resolver/dns_key.c | 28 ++++++----
net/ipv4/netfilter/ip_tables.c | 1 +
net/ipv4/sysctl_net_ipv4.c | 18 ++++--
net/ipv4/tcp_input.c | 9 +++
net/ipv6/netfilter/ip6_tables.c | 1 +
net/nfc/llcp_commands.c | 9 ++-
net/rds/loop.c | 1 +
net/rds/rds.h | 5 ++
net/rds/recv.c | 5 ++
net/sched/sch_blackhole.c | 2 +-
30 files changed, 228 insertions(+), 146 deletions(-)
Greetings,
I represent business group in Middle East looking for projects to
fund; we seek any business that will guaranty a safe and secure
return
on investments. Alternative powers, movies, start up companies
etc. We
are also looking for commercial building projects, hotels,
casino,
strip malls etc.
If you have a project that differs from these and current budget
is
over $40M, Write us and Provide Executive Summary and project
details.
(a)100% financing is available.
(b)Possible JV partnerships or 100% loans.
(c)Project must be over $20M total budget.
I look forward to your reply,
Sir Abraham Oded
Sahara Petrochems Ltd
odoabraham47(a)gmail.com
For some time now, if you load the bonding driver and configure bond
parameters via sysfs using minimal config options, such as specifying
nothing but the mode, relying on defaults for everything else, modes
that cannot use arp monitoring (802.3ad, balance-tlb, balance-alb) all
wind up with both arp_interval=0 (as it should be) and miimon=0, which
means the miimon monitor thread never actually runs. This is particularly
problematic for 802.3ad.
For example, from an LNST recipe I've set up:
$ modprobe bonding max_bonds=0"
$ echo "+t_bond0" > /sys/class/net/bonding_masters"
$ ip link set t_bond0 down"
$ echo "802.3ad" > /sys/class/net/t_bond0/bonding/mode"
$ ip link set ens1f1 down"
$ echo "+ens1f1" > /sys/class/net/t_bond0/bonding/slaves"
$ ip link set ens1f0 down"
$ echo "+ens1f0" > /sys/class/net/t_bond0/bonding/slaves"
$ ethtool -i t_bond0"
$ ip link set ens1f1 up"
$ ip link set ens1f0 up"
$ ip link set t_bond0 up"
$ ip addr add 192.168.9.1/24 dev t_bond0"
$ ip addr add 2002::1/64 dev t_bond0"
This bond comes up okay, but things look slightly suspect in
/proc/net/bonding/t_bond0 output:
$ grep -i mii /proc/net/bonding/t_bond0
MII Status: up
MII Polling Interval (ms): 0
MII Status: up
MII Status: up
Now, pull a cable on one of the ports in the bond, then reconnect it, and
you'll see:
Slave Interface: ens1f0
MII Status: down
Speed: 1000 Mbps
Duplex: full
I believe this became a major issue as of commit 4d2c0cda0744, which for
802.3ad bonds, sets slave->link = BOND_LINK_DOWN, with a comment about
relying on link monitoring via miimon to set it correctly, but since the
miimon work queue never runs, the link just stays marked down.
If we simply tweak bond_option_mode_set() slightly, we can check for the
non-arp modes having no miimon value set, and insert BOND_DEFAULT_MIIMON,
which gets things back in full working order. This problem exists as far
back as 4.14, and might be worth fixing in all stable trees since, though
the work-around is to simply specify an miimon value yourself.
Reported-by: Bob Ball <ball(a)umich.edu>
CC: Jay Vosburgh <j.vosburgh(a)gmail.com>
CC: Veaceslav Falico <vfalico(a)gmail.com>
CC: Andy Gospodarek <andy(a)greyhouse.net>
CC: Mahesh Bandewar <maheshb(a)google.com>
CC: David S. Miller <davem(a)davemloft.net>
CC: netdev(a)vger.kernel.org
CC: stable(a)vger.kernel.org
Signed-off-by: Jarod Wilson <jarod(a)redhat.com>
---
drivers/net/bonding/bond_options.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 98663c50ded0..4d5d01cb8141 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -743,15 +743,20 @@ const struct bond_option *bond_opt_get(unsigned int option)
static int bond_option_mode_set(struct bonding *bond,
const struct bond_opt_value *newval)
{
- if (!bond_mode_uses_arp(newval->value) && bond->params.arp_interval) {
- netdev_dbg(bond->dev, "%s mode is incompatible with arp monitoring, start mii monitoring\n",
- newval->string);
- /* disable arp monitoring */
- bond->params.arp_interval = 0;
- /* set miimon to default value */
- bond->params.miimon = BOND_DEFAULT_MIIMON;
- netdev_dbg(bond->dev, "Setting MII monitoring interval to %d\n",
- bond->params.miimon);
+ if (!bond_mode_uses_arp(newval->value)) {
+ if (bond->params.arp_interval) {
+ netdev_dbg(bond->dev, "%s mode is incompatible with arp monitoring, start mii monitoring\n",
+ newval->string);
+ /* disable arp monitoring */
+ bond->params.arp_interval = 0;
+ }
+
+ if (!bond->params.miimon) {
+ /* set miimon to default value */
+ bond->params.miimon = BOND_DEFAULT_MIIMON;
+ netdev_dbg(bond->dev, "Setting MII monitoring interval to %d\n",
+ bond->params.miimon);
+ }
}
if (newval->value == BOND_MODE_ALB)
--
2.16.1
This is the start of the stable review cycle for the 4.9.114 release.
There are 66 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Jul 22 12:13:47 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.114-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.114-rc1
Tejun Heo <tj(a)kernel.org>
string: drop __must_check from strscpy() and restore strscpy() usages in cgroup
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_ID
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: KVM: Add ARCH_WORKAROUND_2 support for guests
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: KVM: Add HYP per-cpu accessors
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: ssbd: Add prctl interface for per-thread mitigation
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: ssbd: Introduce thread flag to control userspace mitigation
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: ssbd: Restore mitigation status on CPU resume
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: ssbd: Add global mitigation state accessor
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: Add 'ssbd' command-line option
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: Add ARCH_WORKAROUND_2 probing
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1
Marc Zyngier <marc.zyngier(a)arm.com>
arm/arm64: smccc: Add SMCCC-specific return codes
Christoffer Dall <christoffer.dall(a)linaro.org>
KVM: arm64: Avoid storing the vcpu pointer on the stack
Marc Zyngier <marc.zyngier(a)arm.com>
KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: alternatives: Add dynamic patching feature
James Morse <james.morse(a)arm.com>
KVM: arm64: Stop save/restoring host tpidr_el1 on VHE
James Morse <james.morse(a)arm.com>
arm64: alternatives: use tpidr_el2 on VHE hosts
James Morse <james.morse(a)arm.com>
KVM: arm64: Change hyp_panic()s dependency on tpidr_el2
James Morse <james.morse(a)arm.com>
KVM: arm/arm64: Convert kvm_host_cpu_state to a static per-cpu allocation
James Morse <james.morse(a)arm.com>
KVM: arm64: Store vcpu on the stack during __guest_enter()
Mark Rutland <mark.rutland(a)arm.com>
arm64: assembler: introduce ldr_this_cpu
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
Santosh Shilimkar <santosh.shilimkar(a)oracle.com>
rds: avoid unenecessary cong_update in loop transport
Florian Westphal <fw(a)strlen.de>
netfilter: ipv6: nf_defrag: drop skb dst before queueing
Eric Biggers <ebiggers(a)google.com>
KEYS: DNS: fix parsing multiple options
Eric Biggers <ebiggers(a)google.com>
reiserfs: fix buffer overflow with long warning messages
Florian Westphal <fw(a)strlen.de>
netfilter: ebtables: reject non-bridge targets
Stefan Wahren <stefan.wahren(a)i2se.com>
net: lan78xx: Fix race in tx pending skb size calculation
Ping-Ke Shih <pkshih(a)realtek.com>
rtlwifi: rtl8821ae: fix firmware is not ready to run
Gustavo A. R. Silva <gustavo(a)embeddedor.com>
net: cxgb3_main: fix potential Spectre v1
Alex Vesker <valex(a)mellanox.com>
net/mlx5: Fix command interface race in polling mode
Eric Dumazet <edumazet(a)google.com>
net/packet: fix use-after-free
Jason Wang <jasowang(a)redhat.com>
vhost_net: validate sock before trying to put its fd
Ilpo Järvinen <ilpo.jarvinen(a)helsinki.fi>
tcp: prevent bogus FRTO undos with non-SACK flows
Yuchung Cheng <ycheng(a)google.com>
tcp: fix Fast Open key endianness
Jiri Slaby <jslaby(a)suse.cz>
r8152: napi hangup fix after disconnect
Aleksander Morgado <aleksander(a)aleksander.es>
qmi_wwan: add support for the Dell Wireless 5821e module
Sudarsana Reddy Kalluru <sudarsana.kalluru(a)cavium.com>
qed: Limit msix vectors in kdump kernel to the minimum required count.
Sudarsana Reddy Kalluru <sudarsana.kalluru(a)cavium.com>
qed: Fix use of incorrect size in memcpy call.
Eric Dumazet <edumazet(a)google.com>
net: sungem: fix rx checksum support
Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
net_sched: blackhole: tell upper qdisc about dropped packets
Shay Agroskin <shayag(a)mellanox.com>
net/mlx5: Fix wrong size allocation for QoS ETC TC regitster
Alex Vesker <valex(a)mellanox.com>
net/mlx5: Fix incorrect raw command length parsing
Eric Dumazet <edumazet(a)google.com>
net: dccp: switch rx_tstamp_last_feedback to monotonic clock
Eric Dumazet <edumazet(a)google.com>
net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
Xin Long <lucien.xin(a)gmail.com>
ipvlan: fix IFLA_MTU ignored on NEWLINK
Gustavo A. R. Silva <gustavo(a)embeddedor.com>
atm: zatm: Fix potential Spectre v1
Christian Lamparter <chunkeey(a)googlemail.com>
crypto: crypto4xx - fix crypto4xx_build_pdr, crypto4xx_build_sdr leak
Christian Lamparter <chunkeey(a)googlemail.com>
crypto: crypto4xx - remove bad list_del
Jonas Gorski <jonas.gorski(a)gmail.com>
bcm63xx_enet: do not write to random DMA channel on BCM6345
Jonas Gorski <jonas.gorski(a)gmail.com>
bcm63xx_enet: correct clock usage
Jonas Gorski <jonas.gorski(a)gmail.com>
spi/bcm63xx: fix typo in bcm63xx_spi_max_length breaking compilation
Jonas Gorski <jonas.gorski(a)gmail.com>
spi/bcm63xx: make spi subsystem aware of message size limits
Heiner Kallweit <hkallweit1(a)gmail.com>
mtd: m25p80: consider max message size in m25p80_read
alex chen <alex.chen(a)huawei.com>
ocfs2: ip_alloc_sem should be taken in ocfs2_get_block()
alex chen <alex.chen(a)huawei.com>
ocfs2: subsystem.su_mutex is required while accessing the item->ci_parent
Nick Desaulniers <ndesaulniers(a)google.com>
x86/paravirt: Make native_save_fl() extern inline
H. Peter Anvin <hpa(a)linux.intel.com>
x86/asm: Add _ASM_ARG* constants for argument registers to <asm/asm.h>
Nick Desaulniers <ndesaulniers(a)google.com>
compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline declarations
David Rientjes <rientjes(a)google.com>
compiler, clang: always inline when CONFIG_OPTIMIZE_INLINING is disabled
Linus Torvalds <torvalds(a)linux-foundation.org>
compiler, clang: properly override 'inline' for clang
David Rientjes <rientjes(a)google.com>
compiler, clang: suppress warning for unused static inline functions
Paul Burton <paul.burton(a)mips.com>
MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
-------------
Diffstat:
Documentation/kernel-parameters.txt | 17 +++
Makefile | 4 +-
arch/arm/include/asm/kvm_host.h | 12 ++
arch/arm/include/asm/kvm_mmu.h | 12 ++
arch/arm/kvm/arm.c | 24 ++--
arch/arm/kvm/psci.c | 18 ++-
arch/arm64/Kconfig | 9 ++
arch/arm64/include/asm/alternative.h | 43 +++++-
arch/arm64/include/asm/assembler.h | 27 +++-
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/cpufeature.h | 22 +++
arch/arm64/include/asm/kvm_asm.h | 41 ++++++
arch/arm64/include/asm/kvm_host.h | 43 ++++++
arch/arm64/include/asm/kvm_mmu.h | 44 ++++++
arch/arm64/include/asm/percpu.h | 12 +-
arch/arm64/include/asm/thread_info.h | 1 +
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/alternative.c | 54 ++++---
arch/arm64/kernel/asm-offsets.c | 2 +
arch/arm64/kernel/cpu_errata.c | 180 ++++++++++++++++++++++++
arch/arm64/kernel/cpufeature.c | 17 +++
arch/arm64/kernel/entry.S | 32 ++++-
arch/arm64/kernel/hibernate.c | 11 ++
arch/arm64/kernel/ssbd.c | 108 ++++++++++++++
arch/arm64/kernel/suspend.c | 8 ++
arch/arm64/kvm/hyp-init.S | 4 +
arch/arm64/kvm/hyp/entry.S | 12 +-
arch/arm64/kvm/hyp/hyp-entry.S | 62 ++++++--
arch/arm64/kvm/hyp/switch.c | 64 +++++++--
arch/arm64/kvm/hyp/sysreg-sr.c | 21 +--
arch/arm64/kvm/reset.c | 4 +
arch/mips/kernel/process.c | 45 ++++--
arch/x86/include/asm/asm.h | 59 ++++++++
arch/x86/include/asm/irqflags.h | 2 +-
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/irqflags.S | 26 ++++
drivers/atm/zatm.c | 2 +
drivers/crypto/amcc/crypto4xx_core.c | 23 ++-
drivers/mtd/devices/m25p80.c | 3 +-
drivers/net/ethernet/broadcom/bcm63xx_enet.c | 34 +++--
drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c | 2 +
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 8 +-
drivers/net/ethernet/mellanox/mlx5/core/port.c | 4 +-
drivers/net/ethernet/qlogic/qed/qed_dcbx.c | 8 +-
drivers/net/ethernet/qlogic/qed/qed_main.c | 9 ++
drivers/net/ethernet/sun/sungem.c | 22 +--
drivers/net/ipvlan/ipvlan_main.c | 3 +-
drivers/net/usb/lan78xx.c | 5 +-
drivers/net/usb/qmi_wwan.c | 1 +
drivers/net/usb/r8152.c | 3 +-
drivers/net/wireless/realtek/rtlwifi/core.c | 1 -
drivers/spi/spi-bcm63xx.c | 9 ++
drivers/vhost/net.c | 3 +-
fs/ocfs2/aops.c | 26 ++--
fs/ocfs2/cluster/nodemanager.c | 63 +++++++--
fs/reiserfs/prints.c | 141 +++++++++++--------
include/linux/arm-smccc.h | 10 ++
include/linux/compiler-gcc.h | 35 +++--
include/linux/string.h | 2 +-
net/bridge/netfilter/ebtables.c | 13 ++
net/dccp/ccids/ccid3.c | 16 ++-
net/dns_resolver/dns_key.c | 28 ++--
net/ipv4/sysctl_net_ipv4.c | 18 ++-
net/ipv4/tcp_input.c | 9 ++
net/ipv6/netfilter/nf_conntrack_reasm.c | 2 +
net/nfc/llcp_commands.c | 9 +-
net/packet/af_packet.c | 14 +-
net/rds/loop.c | 1 +
net/rds/rds.h | 5 +
net/rds/recv.c | 5 +
net/sched/sch_blackhole.c | 2 +-
virt/kvm/arm/hyp/vgic-v2-sr.c | 2 +-
72 files changed, 1315 insertions(+), 271 deletions(-)
This is a note to let you know that I've just added the patch titled
pty: fix O_CLOEXEC for TIOCGPTPEER
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 36ecc1481dc8d8c52d43ba18c6b642c1d2fde789 Mon Sep 17 00:00:00 2001
From: Matthijs van Duin <matthijsvanduin(a)gmail.com>
Date: Thu, 19 Jul 2018 10:43:46 +0200
Subject: pty: fix O_CLOEXEC for TIOCGPTPEER
It was being ignored because the flags were not passed to fd allocation.
Fixes: 54ebbfb16034 ("tty: add TIOCGPTPEER ioctl")
Signed-off-by: Matthijs van Duin <matthijsvanduin(a)gmail.com>
Acked-by: Aleksa Sarai <asarai(a)suse.de>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/pty.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index b0e2c4847a5d..678406e0948b 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -625,7 +625,7 @@ int ptm_open_peer(struct file *master, struct tty_struct *tty, int flags)
if (tty->driver != ptm_driver)
return -EIO;
- fd = get_unused_fd_flags(0);
+ fd = get_unused_fd_flags(flags);
if (fd < 0) {
retval = fd;
goto err;
--
2.18.0
This is a note to let you know that I've just added the patch titled
uio: fix wrong return value from uio_mmap()
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From e7de2590f18a272e63732b9d519250d1b522b2c4 Mon Sep 17 00:00:00 2001
From: Hailong Liu <liu.hailong6(a)zte.com.cn>
Date: Fri, 20 Jul 2018 08:31:56 +0800
Subject: uio: fix wrong return value from uio_mmap()
uio_mmap has multiple fail paths to set return value to nonzero then
goto out. However, it always returns *0* from the *out* at end, and
this will mislead callers who check the return value of this function.
Fixes: 57c5f4df0a5a0ee ("uio: fix crash after the device is unregistered")
CC: Xiubo Li <xiubli(a)redhat.com>
Signed-off-by: Hailong Liu <liu.hailong6(a)zte.com.cn>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Jiang Biao <jiang.biao2(a)zte.com.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/uio/uio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index f63967c8e95a..144cf7365288 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -813,7 +813,7 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
out:
mutex_unlock(&idev->info_lock);
- return 0;
+ return ret;
}
static const struct file_operations uio_fops = {
--
2.18.0
This is a note to let you know that I've just added the patch titled
pty: fix O_CLOEXEC for TIOCGPTPEER
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 36ecc1481dc8d8c52d43ba18c6b642c1d2fde789 Mon Sep 17 00:00:00 2001
From: Matthijs van Duin <matthijsvanduin(a)gmail.com>
Date: Thu, 19 Jul 2018 10:43:46 +0200
Subject: pty: fix O_CLOEXEC for TIOCGPTPEER
It was being ignored because the flags were not passed to fd allocation.
Fixes: 54ebbfb16034 ("tty: add TIOCGPTPEER ioctl")
Signed-off-by: Matthijs van Duin <matthijsvanduin(a)gmail.com>
Acked-by: Aleksa Sarai <asarai(a)suse.de>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/pty.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index b0e2c4847a5d..678406e0948b 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -625,7 +625,7 @@ int ptm_open_peer(struct file *master, struct tty_struct *tty, int flags)
if (tty->driver != ptm_driver)
return -EIO;
- fd = get_unused_fd_flags(0);
+ fd = get_unused_fd_flags(flags);
if (fd < 0) {
retval = fd;
goto err;
--
2.18.0
This is a note to let you know that I've just added the patch titled
uio: fix wrong return value from uio_mmap()
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From e7de2590f18a272e63732b9d519250d1b522b2c4 Mon Sep 17 00:00:00 2001
From: Hailong Liu <liu.hailong6(a)zte.com.cn>
Date: Fri, 20 Jul 2018 08:31:56 +0800
Subject: uio: fix wrong return value from uio_mmap()
uio_mmap has multiple fail paths to set return value to nonzero then
goto out. However, it always returns *0* from the *out* at end, and
this will mislead callers who check the return value of this function.
Fixes: 57c5f4df0a5a0ee ("uio: fix crash after the device is unregistered")
CC: Xiubo Li <xiubli(a)redhat.com>
Signed-off-by: Hailong Liu <liu.hailong6(a)zte.com.cn>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Jiang Biao <jiang.biao2(a)zte.com.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/uio/uio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index f63967c8e95a..144cf7365288 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -813,7 +813,7 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
out:
mutex_unlock(&idev->info_lock);
- return 0;
+ return ret;
}
static const struct file_operations uio_fops = {
--
2.18.0
This is a note to let you know that I've just added the patch titled
usb: core: handle hub C_PORT_OVER_CURRENT condition
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 249a32b7eeb3edb6897dd38f89651a62163ac4ed Mon Sep 17 00:00:00 2001
From: Bin Liu <b-liu(a)ti.com>
Date: Thu, 19 Jul 2018 14:39:37 -0500
Subject: usb: core: handle hub C_PORT_OVER_CURRENT condition
Based on USB2.0 Spec Section 11.12.5,
"If a hub has per-port power switching and per-port current limiting,
an over-current on one port may still cause the power on another port
to fall below specific minimums. In this case, the affected port is
placed in the Power-Off state and C_PORT_OVER_CURRENT is set for the
port, but PORT_OVER_CURRENT is not set."
so let's check C_PORT_OVER_CURRENT too for over current condition.
Fixes: 08d1dec6f405 ("usb:hub set hub->change_bits when over-current happens")
Cc: <stable(a)vger.kernel.org>
Tested-by: Alessandro Antenucci <antenucci(a)korg.it>
Signed-off-by: Bin Liu <b-liu(a)ti.com>
Acked-by: Alan Stern <stern(a)rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/hub.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index fcae521df29b..1fb266809966 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1142,10 +1142,14 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
if (!udev || udev->state == USB_STATE_NOTATTACHED) {
/* Tell hub_wq to disconnect the device or
- * check for a new connection
+ * check for a new connection or over current condition.
+ * Based on USB2.0 Spec Section 11.12.5,
+ * C_PORT_OVER_CURRENT could be set while
+ * PORT_OVER_CURRENT is not. So check for any of them.
*/
if (udev || (portstatus & USB_PORT_STAT_CONNECTION) ||
- (portstatus & USB_PORT_STAT_OVERCURRENT))
+ (portstatus & USB_PORT_STAT_OVERCURRENT) ||
+ (portchange & USB_PORT_STAT_C_OVERCURRENT))
set_bit(port1, hub->change_bits);
} else if (portstatus & USB_PORT_STAT_ENABLE) {
--
2.18.0
This is a note to let you know that I've just added the patch titled
USB: serial: sierra: fix potential deadlock at close
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From e60870012e5a35b1506d7b376fddfb30e9da0b27 Mon Sep 17 00:00:00 2001
From: John Ogness <john.ogness(a)linutronix.de>
Date: Sun, 24 Jun 2018 00:32:11 +0200
Subject: USB: serial: sierra: fix potential deadlock at close
The portdata spinlock can be taken in interrupt context (via
sierra_outdat_callback()).
Disable interrupts when taking the portdata spinlock when discarding
deferred URBs during close to prevent a possible deadlock.
Fixes: 014333f77c0b ("USB: sierra: fix urb and memory leak on disconnect")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: John Ogness <john.ogness(a)linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
[ johan: amend commit message and add fixes and stable tags ]
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/serial/sierra.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/serial/sierra.c b/drivers/usb/serial/sierra.c
index d189f953c891..55956a638f5b 100644
--- a/drivers/usb/serial/sierra.c
+++ b/drivers/usb/serial/sierra.c
@@ -770,9 +770,9 @@ static void sierra_close(struct usb_serial_port *port)
kfree(urb->transfer_buffer);
usb_free_urb(urb);
usb_autopm_put_interface_async(serial->interface);
- spin_lock(&portdata->lock);
+ spin_lock_irq(&portdata->lock);
portdata->outstanding_urbs--;
- spin_unlock(&portdata->lock);
+ spin_unlock_irq(&portdata->lock);
}
sierra_stop_rx_urbs(port);
--
2.18.0
From: Jing Xia <jing.xia.mail(a)gmail.com>
Subject: mm: memcg: fix use after free in mem_cgroup_iter()
It was reported that a kernel crash happened in mem_cgroup_iter(), which
can be triggered if the legacy cgroup-v1 non-hierarchical mode is used.
Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
......
Call trace:
mem_cgroup_iter+0x2e0/0x6d4
shrink_zone+0x8c/0x324
balance_pgdat+0x450/0x640
kswapd+0x130/0x4b8
kthread+0xe8/0xfc
ret_from_fork+0x10/0x20
mem_cgroup_iter():
......
if (css_tryget(css)) <-- crash here
break;
......
The crashing reason is that mem_cgroup_iter() uses the memcg object whose
pointer is stored in iter->position, which has been freed before and
filled with POISON_FREE(0x6b).
And the root cause of the use-after-free issue is that
invalidate_reclaim_iterators() fails to reset the value of iter->position
to NULL when the css of the memcg is released in non- hierarchical mode.
Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.…
Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
Signed-off-by: Jing Xia <jing.xia.mail(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: <chunyan.zhang(a)unisoc.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter mm/memcontrol.c
--- a/mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter
+++ a/mm/memcontrol.c
@@ -850,7 +850,7 @@ static void invalidate_reclaim_iterators
int nid;
int i;
- while ((memcg = parent_mem_cgroup(memcg))) {
+ for (; memcg; memcg = parent_mem_cgroup(memcg)) {
for_each_node(nid) {
mz = mem_cgroup_nodeinfo(memcg, nid);
for (i = 0; i <= DEF_PRIORITY; i++) {
_
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/huge_memory.c: fix data loss when splitting a file pmd
__split_huge_pmd_locked() must check if the cleared huge pmd was dirty,
and propagate that to PageDirty: otherwise, data may be lost when a huge
tmpfs page is modified then split then reclaimed.
How has this taken so long to be noticed? Because there was no problem
when the huge page is written by a write system call (shmem_write_end()
calls set_page_dirty()), nor when the page is allocated for a write fault
(fault_dirty_shared_page() calls set_page_dirty()); but when allocated for
a read fault (which MAP_POPULATE simulates), no set_page_dirty().
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1807111741430.1106@eggly.anvils
Fixes: d21b9e57c74c ("thp: handle file pages in split_huge_pmd()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Ashwin Chaugule <ashwinch(a)google.com>
Reviewed-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 2 ++
1 file changed, 2 insertions(+)
diff -puN mm/huge_memory.c~thp-fix-data-loss-when-splitting-a-file-pmd mm/huge_memory.c
--- a/mm/huge_memory.c~thp-fix-data-loss-when-splitting-a-file-pmd
+++ a/mm/huge_memory.c
@@ -2084,6 +2084,8 @@ static void __split_huge_pmd_locked(stru
if (vma_is_dax(vma))
return;
page = pmd_page(_pmd);
+ if (!PageDirty(page) && pmd_dirty(_pmd))
+ set_page_dirty(page);
if (!PageReferenced(page) && pmd_young(_pmd))
SetPageReferenced(page);
page_remove_rmap(page, true);
_
This is a note to let you know that I've just added the patch titled
USB: serial: sierra: fix potential deadlock at close
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the usb-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From e60870012e5a35b1506d7b376fddfb30e9da0b27 Mon Sep 17 00:00:00 2001
From: John Ogness <john.ogness(a)linutronix.de>
Date: Sun, 24 Jun 2018 00:32:11 +0200
Subject: USB: serial: sierra: fix potential deadlock at close
The portdata spinlock can be taken in interrupt context (via
sierra_outdat_callback()).
Disable interrupts when taking the portdata spinlock when discarding
deferred URBs during close to prevent a possible deadlock.
Fixes: 014333f77c0b ("USB: sierra: fix urb and memory leak on disconnect")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: John Ogness <john.ogness(a)linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
[ johan: amend commit message and add fixes and stable tags ]
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/serial/sierra.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/serial/sierra.c b/drivers/usb/serial/sierra.c
index d189f953c891..55956a638f5b 100644
--- a/drivers/usb/serial/sierra.c
+++ b/drivers/usb/serial/sierra.c
@@ -770,9 +770,9 @@ static void sierra_close(struct usb_serial_port *port)
kfree(urb->transfer_buffer);
usb_free_urb(urb);
usb_autopm_put_interface_async(serial->interface);
- spin_lock(&portdata->lock);
+ spin_lock_irq(&portdata->lock);
portdata->outstanding_urbs--;
- spin_unlock(&portdata->lock);
+ spin_unlock_irq(&portdata->lock);
}
sierra_stop_rx_urbs(port);
--
2.18.0
This is a note to let you know that I've just added the patch titled
usb: xhci: Fix memory leak in xhci_endpoint_reset()
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From d89b7664f76047e7beca8f07e86f2ccfad085a28 Mon Sep 17 00:00:00 2001
From: Zheng Xiaowei <zhengxiaowei(a)ruijie.com.cn>
Date: Fri, 20 Jul 2018 18:05:11 +0300
Subject: usb: xhci: Fix memory leak in xhci_endpoint_reset()
If td_list is not empty the cfg_cmd will not be freed,
call xhci_free_command to free it.
Signed-off-by: Zheng Xiaowei <zhengxiaowei(a)ruijie.com.cn>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 2f4850f25e82..68e6132aa8b2 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3051,6 +3051,7 @@ static void xhci_endpoint_reset(struct usb_hcd *hcd,
if (!list_empty(&ep->ring->td_list)) {
dev_err(&udev->dev, "EP not empty, refuse reset\n");
spin_unlock_irqrestore(&xhci->lock, flags);
+ xhci_free_command(xhci, cfg_cmd);
goto cleanup;
}
xhci_queue_stop_endpoint(xhci, stop_cmd, udev->slot_id, ep_index, 0);
--
2.18.0
Based on USB2.0 Spec Section 11.12.5,
"If a hub has per-port power switching and per-port current limiting,
an over-current on one port may still cause the power on another port
to fall below specific minimums. In this case, the affected port is
placed in the Power-Off state and C_PORT_OVER_CURRENT is set for the
port, but PORT_OVER_CURRENT is not set."
so let's check C_PORT_OVER_CURRENT too for over current condition.
Fixes: 08d1dec6f405 ("usb:hub set hub->change_bits when over-current happens")
Cc: <stable(a)vger.kernel.org>
Tested-by: Alessandro Antenucci <antenucci(a)korg.it>
Signed-off-by: Bin Liu <b-liu(a)ti.com>
---
drivers/usb/core/hub.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index fcae521df29b..1fb266809966 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1142,10 +1142,14 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
if (!udev || udev->state == USB_STATE_NOTATTACHED) {
/* Tell hub_wq to disconnect the device or
- * check for a new connection
+ * check for a new connection or over current condition.
+ * Based on USB2.0 Spec Section 11.12.5,
+ * C_PORT_OVER_CURRENT could be set while
+ * PORT_OVER_CURRENT is not. So check for any of them.
*/
if (udev || (portstatus & USB_PORT_STAT_CONNECTION) ||
- (portstatus & USB_PORT_STAT_OVERCURRENT))
+ (portstatus & USB_PORT_STAT_OVERCURRENT) ||
+ (portchange & USB_PORT_STAT_C_OVERCURRENT))
set_bit(port1, hub->change_bits);
} else if (portstatus & USB_PORT_STAT_ENABLE) {
--
1.9.1
This is a note to let you know that I've just added the patch titled
usb: gadget: f_fs: Only return delayed status when len is 0
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 4d644abf25698362bd33d17c9ddc8f7122c30f17 Mon Sep 17 00:00:00 2001
From: Jerry Zhang <zhangjerry(a)google.com>
Date: Mon, 2 Jul 2018 12:48:08 -0700
Subject: usb: gadget: f_fs: Only return delayed status when len is 0
Commit 1b9ba000 ("Allow function drivers to pause control
transfers") states that USB_GADGET_DELAYED_STATUS is only
supported if data phase is 0 bytes.
It seems that when the length is not 0 bytes, there is no
need to explicitly delay the data stage since the transfer
is not completed until the user responds. However, when the
length is 0, there is no data stage and the transfer is
finished once setup() returns, hence there is a need to
explicitly delay completion.
This manifests as the following bugs:
Prior to 946ef68ad4e4 ('Let setup() return
USB_GADGET_DELAYED_STATUS'), when setup is 0 bytes, ffs
would require user to queue a 0 byte request in order to
clear setup state. However, that 0 byte request was actually
not needed and would hang and cause errors in other setup
requests.
After the above commit, 0 byte setups work since the gadget
now accepts empty queues to ep0 to clear the delay, but all
other setups hang.
Fixes: 946ef68ad4e4 ("Let setup() return USB_GADGET_DELAYED_STATUS")
Signed-off-by: Jerry Zhang <zhangjerry(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
Acked-by: Felipe Balbi <felipe.balbi(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/gadget/function/f_fs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 33e2030503fa..3ada83d81bda 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -3263,7 +3263,7 @@ static int ffs_func_setup(struct usb_function *f,
__ffs_event_add(ffs, FUNCTIONFS_SETUP);
spin_unlock_irqrestore(&ffs->ev.waitq.lock, flags);
- return USB_GADGET_DELAYED_STATUS;
+ return creq->wLength == 0 ? USB_GADGET_DELAYED_STATUS : 0;
}
static bool ffs_func_req_match(struct usb_function *f,
--
2.18.0
On Fri, Jul 20, 2018 at 10:00:34AM +0200, Dmitry Vyukov wrote:
> On Fri, Jul 20, 2018 at 9:53 AM, James Chapman <jchapman(a)katalix.com> wrote:
> > On 18/07/18 12:00, Dmitry Vyukov wrote:
> >> On Tue, Jan 16, 2018 at 7:29 PM, syzbot
> >> <syzbot+065d0fc357520c8f6039(a)syzkaller.appspotmail.com> wrote:
> >>> Hello,
> >>>
> >>> syzkaller hit the following crash on
> >>> a8750ddca918032d6349adbf9a4b6555e7db20da
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> >>> compiler: gcc (GCC) 7.1.1 20170620
> >>> .config is attached
> >>> Raw console output is attached.
> >>> Unfortunately, I don't have any reproducer for this bug yet.
> >>>
> >>>
> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >>> Reported-by: syzbot+065d0fc357520c8f6039(a)syzkaller.appspotmail.com
> >>> It will help syzbot understand when the bug is fixed. See footer for
> >>> details.
> >>> If you forward the report, please keep this part and the footer.
> >>
> >> James,
> >>
> >> Did you fix this? You asked syzbot to test a fix for this bug some time ago.
> >> If yes, did you include the Reported-by tag in the commit? This bug is
> >> still considered open by syzbot. But it stopped happening ~4 months
> >> ago:
> >
> > Yes, I think this has been fixed now. I think it was fixed by
> > Guillaume's 6b9f34239b00e6956a267abed2bc559ede556ad6 that was actually
> > to fix another syzbot bug fbeeb5c3b538e8545644 which looks similar to
> > this one.
> >
> >> https://syzkaller.appspot.com/bug?id=6fed0854381422329e78d7e16fb9cf4af8c9ae…
> >> We are also seeing these crashes in 4.4 and 4.9, it would be good to
> >> backport the fix.
> >
> > It looks like 6b9f34239b00e6956a267abed2bc559ede556ad6 hasn't made it to
> > 4.9 or 4.4.
>
> Thanks for the update!
>
> Let's tell syzbot that this is fixed:
>
> #syz fix: l2tp: fix races in tunnel creation
>
> Greg H: so this is probably the patch we need.
>
> +Greg KH: I think we need this in stable, we hit this in both 4.4 and 4.9.
It's also needed in 4.14.y. But it doesn't apply to any of those kernel
trees cleanly, can someone please provide a working backport?
thanks,
greg k-h
Sehr geehrte Damen und Herren,
Nach der Analyse Ihrer Internetseite haben wir Fehler im Code ermittelt, die einen groβen Einfluss darauf haben, dass Ihre Webseite eine niedrige Position in den Suchmaschinen,
darunter auch in der wichtigsten, d. h. auf Google, einnimmt.
Wir bieten Ihnen die Optimierung Ihrer Webseite ohne Abonnement, ohne monatliche Gebühren, ohne versteckte Kosten und ohne Strafen für den Vertragsbruch an.
Die Optimierung Ihrer Internetseite wird für die bessere Suchmaschinenfreundlichkeit sorgen und dadurch wird auch ihre Position im Google-Ranking und in anderen Suchmaschinen steigen.
***
Aktionspreis: 240 € (inkl. MwSt.) - bis zum 20.07.2018
***
Mehr Informationen finden Sie auf unserer Internetseite:
http://www.web-suchmaschinenoptimierung.net
Mit freundlichen Grüßen
Martin Schoch
Web Analytics
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3f6e6986045d47f87bd982910821b7ab9758487e Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Date: Sat, 23 Jun 2018 01:06:34 +0900
Subject: [PATCH] mtd: rawnand: denali_dt: set clk_x_rate to 200 MHz
unconditionally
Since commit 1bb88666775e ("mtd: nand: denali: handle timing parameters
by setup_data_interface()"), denali_dt.c gets the clock rate from the
clock driver. The driver expects the frequency of the bus interface
clock, whereas the clock driver of SOCFPGA provides the core clock.
Thus, the setup_data_interface() hook calculates timing parameters
based on a wrong frequency.
To make it work without relying on the clock driver, hard-code the clock
frequency, 200MHz. This is fine for existing DT of UniPhier, and also
fixes the issue of SOCFPGA because both platforms use 200 MHz for the
bus interface clock.
Fixes: 1bb88666775e ("mtd: nand: denali: handle timing parameters by setup_data_interface()")
Cc: linux-stable <stable(a)vger.kernel.org> #4.14+
Reported-by: Philipp Rosenberger <p.rosenberger(a)linutronix.de>
Suggested-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Tested-by: Richard Weinberger <richard(a)nod.at>
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
diff --git a/drivers/mtd/nand/raw/denali_dt.c b/drivers/mtd/nand/raw/denali_dt.c
index cfd33e6ca77f..5869e90cc14b 100644
--- a/drivers/mtd/nand/raw/denali_dt.c
+++ b/drivers/mtd/nand/raw/denali_dt.c
@@ -123,7 +123,11 @@ static int denali_dt_probe(struct platform_device *pdev)
if (ret)
return ret;
- denali->clk_x_rate = clk_get_rate(dt->clk);
+ /*
+ * Hardcode the clock rate for the backward compatibility.
+ * This works for both SOCFPGA and UniPhier.
+ */
+ denali->clk_x_rate = 200000000;
ret = denali_init(denali);
if (ret)
The patch titled
Subject: mm: memcg: fix use after free in mem_cgroup_iter()
has been added to the -mm tree. Its filename is
mm-memcg-fix-use-after-free-in-mem_cgroup_iter.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-fix-use-after-free-in-mem…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-fix-use-after-free-in-mem…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Jing Xia <jing.xia.mail(a)gmail.com>
Subject: mm: memcg: fix use after free in mem_cgroup_iter()
It was reported that a kernel crash happened in mem_cgroup_iter(), which
can be triggered if the legacy cgroup-v1 non-hierarchical mode is used.
Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
......
Call trace:
mem_cgroup_iter+0x2e0/0x6d4
shrink_zone+0x8c/0x324
balance_pgdat+0x450/0x640
kswapd+0x130/0x4b8
kthread+0xe8/0xfc
ret_from_fork+0x10/0x20
mem_cgroup_iter():
......
if (css_tryget(css)) <-- crash here
break;
......
The crashing reason is that mem_cgroup_iter() uses the memcg object whose
pointer is stored in iter->position, which has been freed before and
filled with POISON_FREE(0x6b).
And the root cause of the use-after-free issue is that
invalidate_reclaim_iterators() fails to reset the value of iter->position
to NULL when the css of the memcg is released in non- hierarchical mode.
Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.…
Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
Signed-off-by: Jing Xia <jing.xia.mail(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: <chunyan.zhang(a)unisoc.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
diff -puN mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter mm/memcontrol.c
--- a/mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter
+++ a/mm/memcontrol.c
@@ -850,7 +850,7 @@ static void invalidate_reclaim_iterators
int nid;
int i;
- while ((memcg = parent_mem_cgroup(memcg))) {
+ for (; memcg; memcg = parent_mem_cgroup(memcg)) {
for_each_node(nid) {
mz = mem_cgroup_nodeinfo(memcg, nid);
for (i = 0; i <= DEF_PRIORITY; i++) {
_
Patches currently in -mm which might be from jing.xia.mail(a)gmail.com are
mm-memcg-fix-use-after-free-in-mem_cgroup_iter.patch
A report from Colin Ian King pointed a CoverityScan issue where error
values on these helpers where not checked in the drivers. These
helpers can error out only in case of a software bug in driver code,
not because of a runtime/hardware error. Hence, let's WARN_ON() in this
case and return 0 which is harmless anyway.
Fixes: 8878b126df76 ("mtd: nand: add ->exec_op() implementation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
Changes since v1:
=================
* At first I decided to continue returning negative errors and
handling these cases in the drivers. Not sure this was the right thing
to do as reported by Boris so now the core WARN_ON() on error (only
due to some bug in a controller driver) and return an harmless
value. The drivers are not touched anymore, hence this patch is alone
now.
drivers/mtd/nand/raw/nand_base.c | 44 ++++++++++++++++++++--------------------
include/linux/mtd/rawnand.h | 16 +++++++--------
2 files changed, 30 insertions(+), 30 deletions(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 4fa5e20d9690..9bb76ddff4be 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -2668,8 +2668,8 @@ static bool nand_subop_instr_is_valid(const struct nand_subop *subop,
return subop && instr_idx < subop->ninstrs;
}
-static int nand_subop_get_start_off(const struct nand_subop *subop,
- unsigned int instr_idx)
+static unsigned int nand_subop_get_start_off(const struct nand_subop *subop,
+ unsigned int instr_idx)
{
if (instr_idx)
return 0;
@@ -2688,12 +2688,12 @@ static int nand_subop_get_start_off(const struct nand_subop *subop,
*
* Given an address instruction, returns the offset of the first cycle to issue.
*/
-int nand_subop_get_addr_start_off(const struct nand_subop *subop,
- unsigned int instr_idx)
+unsigned int nand_subop_get_addr_start_off(const struct nand_subop *subop,
+ unsigned int instr_idx)
{
- if (!nand_subop_instr_is_valid(subop, instr_idx) ||
- subop->instrs[instr_idx].type != NAND_OP_ADDR_INSTR)
- return -EINVAL;
+ if (WARN_ON(!nand_subop_instr_is_valid(subop, instr_idx) ||
+ subop->instrs[instr_idx].type != NAND_OP_ADDR_INSTR))
+ return 0;
return nand_subop_get_start_off(subop, instr_idx);
}
@@ -2710,14 +2710,14 @@ EXPORT_SYMBOL_GPL(nand_subop_get_addr_start_off);
*
* Given an address instruction, returns the number of address cycle to issue.
*/
-int nand_subop_get_num_addr_cyc(const struct nand_subop *subop,
- unsigned int instr_idx)
+unsigned int nand_subop_get_num_addr_cyc(const struct nand_subop *subop,
+ unsigned int instr_idx)
{
int start_off, end_off;
- if (!nand_subop_instr_is_valid(subop, instr_idx) ||
- subop->instrs[instr_idx].type != NAND_OP_ADDR_INSTR)
- return -EINVAL;
+ if (WARN_ON(!nand_subop_instr_is_valid(subop, instr_idx) ||
+ subop->instrs[instr_idx].type != NAND_OP_ADDR_INSTR))
+ return 0;
start_off = nand_subop_get_addr_start_off(subop, instr_idx);
@@ -2742,12 +2742,12 @@ EXPORT_SYMBOL_GPL(nand_subop_get_num_addr_cyc);
*
* Given a data instruction, returns the offset to start from.
*/
-int nand_subop_get_data_start_off(const struct nand_subop *subop,
- unsigned int instr_idx)
+unsigned int nand_subop_get_data_start_off(const struct nand_subop *subop,
+ unsigned int instr_idx)
{
- if (!nand_subop_instr_is_valid(subop, instr_idx) ||
- !nand_instr_is_data(&subop->instrs[instr_idx]))
- return -EINVAL;
+ if (WARN_ON(!nand_subop_instr_is_valid(subop, instr_idx) ||
+ !nand_instr_is_data(&subop->instrs[instr_idx])))
+ return 0;
return nand_subop_get_start_off(subop, instr_idx);
}
@@ -2764,14 +2764,14 @@ EXPORT_SYMBOL_GPL(nand_subop_get_data_start_off);
*
* Returns the length of the chunk of data to send/receive.
*/
-int nand_subop_get_data_len(const struct nand_subop *subop,
- unsigned int instr_idx)
+unsigned int nand_subop_get_data_len(const struct nand_subop *subop,
+ unsigned int instr_idx)
{
int start_off = 0, end_off;
- if (!nand_subop_instr_is_valid(subop, instr_idx) ||
- !nand_instr_is_data(&subop->instrs[instr_idx]))
- return -EINVAL;
+ if (WARN_ON(!nand_subop_instr_is_valid(subop, instr_idx) ||
+ !nand_instr_is_data(&subop->instrs[instr_idx])))
+ return 0;
start_off = nand_subop_get_data_start_off(subop, instr_idx);
diff --git a/include/linux/mtd/rawnand.h b/include/linux/mtd/rawnand.h
index e383c7f32574..876a9dd47e74 100644
--- a/include/linux/mtd/rawnand.h
+++ b/include/linux/mtd/rawnand.h
@@ -1007,14 +1007,14 @@ struct nand_subop {
unsigned int last_instr_end_off;
};
-int nand_subop_get_addr_start_off(const struct nand_subop *subop,
- unsigned int op_id);
-int nand_subop_get_num_addr_cyc(const struct nand_subop *subop,
- unsigned int op_id);
-int nand_subop_get_data_start_off(const struct nand_subop *subop,
- unsigned int op_id);
-int nand_subop_get_data_len(const struct nand_subop *subop,
- unsigned int op_id);
+unsigned int nand_subop_get_addr_start_off(const struct nand_subop *subop,
+ unsigned int op_id);
+unsigned int nand_subop_get_num_addr_cyc(const struct nand_subop *subop,
+ unsigned int op_id);
+unsigned int nand_subop_get_data_start_off(const struct nand_subop *subop,
+ unsigned int op_id);
+unsigned int nand_subop_get_data_len(const struct nand_subop *subop,
+ unsigned int op_id);
/**
* struct nand_op_parser_addr_constraints - Constraints for address instructions
--
2.14.1
Tree/Branch: v4.4.142
Git describe: v4.4.142
Commit: ecb9989751 Linux 4.4.142
Build Time: 63 min 12 sec
Passed: 10 / 10 (100.00 %)
Failed: 0 / 10 ( 0.00 %)
Errors: 0
Warnings: 31
Section Mismatches: 0
-------------------------------------------------------------------------------
defconfigs with issues (other than build errors):
19 warnings 0 mismatches : arm64-allmodconfig
17 warnings 0 mismatches : x86_64-allmodconfig
-------------------------------------------------------------------------------
Warnings Summary: 31
3 warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
2 ../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
2 ../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
2 ../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2752 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2720 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4759:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2096 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 6784 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5840 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2304 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2104 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2552 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3248 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 3008 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 2992 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5296 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv0367.c:3147:1: warning: the frame size of 4144 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv0367.c:2490:1: warning: the frame size of 3424 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2984 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4328 bytes is larger than 2048 bytes [-Wframe-larger-than=]
===============================================================================
Detailed per-defconfig build reports below:
-------------------------------------------------------------------------------
arm64-allmodconfig : PASS, 0 errors, 19 warnings, 0 section mismatches
Warnings:
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 2992 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3248 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5840 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 6784 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv0367.c:2490:1: warning: the frame size of 3424 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2720 bytes is larger than 2048 bytes [-Wframe-larger-than=]
-------------------------------------------------------------------------------
x86_64-allmodconfig : PASS, 0 errors, 17 warnings, 0 section mismatches
Warnings:
../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 3008 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2104 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2304 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2096 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5296 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4759:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2552 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv0367.c:3147:1: warning: the frame size of 4144 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2984 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4328 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2752 bytes is larger than 2048 bytes [-Wframe-larger-than=]
-------------------------------------------------------------------------------
Passed with no errors, warnings or mismatches:
arm64-allnoconfig
arm-multi_v5_defconfig
arm-multi_v7_defconfig
x86_64-defconfig
arm-allmodconfig
arm-allnoconfig
x86_64-allnoconfig
arm64-defconfig
From: Paul Burton <paul.burton(a)mips.com>
commit 5a267832c2ec47b2dad0fdb291a96bb5b8869315 upstream.
The generic nmi_cpu_backtrace() function calls show_regs() when a struct
pt_regs is available, and dump_stack() otherwise. If we were to make use
of the generic nmi_cpu_backtrace() with MIPS' current implementation of
show_regs() this would mean that we see only register data with no
accompanying stack information, in contrast with our current
implementation which calls dump_stack() regardless of whether register
state is available.
In preparation for making use of the generic nmi_cpu_backtrace() to
implement arch_trigger_cpumask_backtrace(), have our implementation of
show_regs() call dump_stack() and drop the explicit dump_stack() call in
arch_dump_stack() which is invoked by arch_trigger_cpumask_backtrace().
This will allow the output we produce to remain the same after a later
patch switches to using nmi_cpu_backtrace(). It may mean that we produce
extra stack output in other uses of show_regs(), but this:
1) Seems harmless.
2) Is good for consistency between arch_trigger_cpumask_backtrace()
and other users of show_regs().
3) Matches the behaviour of the ARM & PowerPC architectures.
Marked for stable back to v4.9 as a prerequisite of the following patch
"MIPS: Call dump_stack() from show_regs()".
Signed-off-by: Paul Burton <paul.burton(a)mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19596/
Cc: James Hogan <jhogan(a)kernel.org>
Cc: Ralf Baechle <ralf(a)linux-mips.org>
Cc: Huacai Chen <chenhc(a)lemote.com>
Cc: linux-mips(a)linux-mips.org
Cc: stable(a)vger.kernel.org
[ Huacai: backported to 4.4: The next patch is also backported to 4.4 ]
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
arch/mips/kernel/process.c | 4 ++--
arch/mips/kernel/traps.c | 1 +
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 1ee603d..f96048a 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -637,8 +637,8 @@ static void arch_dump_stack(void *info)
if (regs)
show_regs(regs);
-
- dump_stack();
+ else
+ dump_stack();
}
void arch_trigger_all_cpu_backtrace(bool include_self)
diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index 31ca2ed..1b90121 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -344,6 +344,7 @@ static void __show_regs(const struct pt_regs *regs)
void show_regs(struct pt_regs *regs)
{
__show_regs((struct pt_regs *)regs);
+ dump_stack();
}
void show_registers(struct pt_regs *regs)
--
2.7.0
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From dea39aca1d7aef1e2b95b07edeacf04cc8863a2e Mon Sep 17 00:00:00 2001
From: Stefan Wahren <stefan.wahren(a)i2se.com>
Date: Sun, 15 Jul 2018 21:53:20 +0200
Subject: [PATCH] net: lan78xx: Fix race in tx pending skb size calculation
The skb size calculation in lan78xx_tx_bh is in race with the start_xmit,
which could lead to rare kernel oopses. So protect the whole skb walk with
a spin lock. As a benefit we can unlink the skb directly.
This patch was tested on Raspberry Pi 3B+
Link: https://github.com/raspberrypi/linux/issues/2608
Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Floris Bos <bos(a)je-eigen-domein.nl>
Signed-off-by: Stefan Wahren <stefan.wahren(a)i2se.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 2e4130746c40..ed10d49eb5e0 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -3344,6 +3344,7 @@ static void lan78xx_tx_bh(struct lan78xx_net *dev)
pkt_cnt = 0;
count = 0;
length = 0;
+ spin_lock_irqsave(&tqp->lock, flags);
for (skb = tqp->next; pkt_cnt < tqp->qlen; skb = skb->next) {
if (skb_is_gso(skb)) {
if (pkt_cnt) {
@@ -3352,7 +3353,8 @@ static void lan78xx_tx_bh(struct lan78xx_net *dev)
}
count = 1;
length = skb->len - TX_OVERHEAD;
- skb2 = skb_dequeue(tqp);
+ __skb_unlink(skb, tqp);
+ spin_unlock_irqrestore(&tqp->lock, flags);
goto gso_skb;
}
@@ -3361,6 +3363,7 @@ static void lan78xx_tx_bh(struct lan78xx_net *dev)
skb_totallen = skb->len + roundup(skb_totallen, sizeof(u32));
pkt_cnt++;
}
+ spin_unlock_irqrestore(&tqp->lock, flags);
/* copy to a single skb */
skb = alloc_skb(skb_totallen, GFP_ATOMIC);
I'm announcing the release of the 4.4.142 kernel.
It's not an "essencial" upgrade, but a number of build problems with
perf are now resolved, and an x86 issue that some people might have hit
is now handled properly. If those were problems for you, please
upgrade.
The updated 4.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.4.y
and can be browsed at the normal kernel.org git web browser:
http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/common.c | 7 ++++---
scripts/Kbuild.include | 5 +++--
tools/arch/x86/include/asm/unistd_32.h | 9 +++++++++
tools/arch/x86/include/asm/unistd_64.h | 9 +++++++++
tools/build/Build.include | 5 +++--
tools/perf/config/Makefile | 1 +
tools/perf/perf-sys.h | 18 ------------------
tools/perf/util/include/asm/unistd_32.h | 1 -
tools/perf/util/include/asm/unistd_64.h | 1 -
tools/scripts/Makefile.include | 2 ++
11 files changed, 32 insertions(+), 28 deletions(-)
Andy Lutomirski (1):
x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6
Arnaldo Carvalho de Melo (1):
perf tools: Move syscall number fallbacks from perf-sys.h to tools/arch/x86/include/asm/
Greg Kroah-Hartman (1):
Linux 4.4.142
Rasmus Villemoes (1):
Kbuild: fix # escaping in .cmd files for future Make
This is the start of the stable review cycle for the 4.4.142 release.
There are 3 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri Jul 20 14:51:50 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.142-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.4.142-rc1
Arnaldo Carvalho de Melo <acme(a)redhat.com>
perf tools: Move syscall number fallbacks from perf-sys.h to tools/arch/x86/include/asm/
Andy Lutomirski <luto(a)kernel.org>
x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6
Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Kbuild: fix # escaping in .cmd files for future Make
-------------
Diffstat:
Makefile | 4 ++--
arch/x86/kernel/cpu/common.c | 7 ++++---
scripts/Kbuild.include | 5 +++--
tools/arch/x86/include/asm/unistd_32.h | 9 +++++++++
tools/arch/x86/include/asm/unistd_64.h | 9 +++++++++
tools/build/Build.include | 5 +++--
tools/perf/config/Makefile | 1 +
tools/perf/perf-sys.h | 18 ------------------
tools/perf/util/include/asm/unistd_32.h | 1 -
tools/perf/util/include/asm/unistd_64.h | 1 -
tools/scripts/Makefile.include | 2 ++
11 files changed, 33 insertions(+), 29 deletions(-)
Every time I tried to upgrade my laptop from 3.10.x to 4.x I faced an
issue by which the fan would run at full speed upon resume. Bisecting
it showed me the issue was introduced in 3.17 by commit 821d6f0359b0
(ACPI / sleep: Do not save NVS for new machines to accelerate S3). This
code only affects machines built starting as of 2012, but this Asus
1025C laptop was made in 2012 and apparently needs the NVS data to be
saved, otherwise the CPU's thermal state is not properly reported on
resume and the fan runs at full speed upon resume.
Here's a very simple way to check if such a machine is affected :
# cat /sys/class/thermal/thermal_zone0/temp
55000
( now suspend, wait one second and resume )
# cat /sys/class/thermal/thermal_zone0/temp
0
(and after ~15 seconds the fan starts to spin)
Let's apply the same quirk as commit cbc00c13 (ACPI: save NVS memory
for Lenovo G50-45) and reuse the function it provides. Note that this
commit was already backported to 4.9.x but not 4.4.x.
Cc: <linux-acpi(a)vger.kernel.org>
Cc: <stable(a)vger.kernel.org> # 3.17+: requires cbc00c13
Signed-off-by: Willy Tarreau <w(a)1wt.eu>
---
drivers/acpi/sleep.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index 974e584..af54d7b 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -338,6 +338,14 @@ static const struct dmi_system_id acpisleep_dmi_table[] __initconst = {
DMI_MATCH(DMI_PRODUCT_NAME, "K54HR"),
},
},
+ {
+ .callback = init_nvs_save_s3,
+ .ident = "Asus 1025C",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "1025C"),
+ },
+ },
/*
* https://bugzilla.kernel.org/show_bug.cgi?id=189431
* Lenovo G50-45 is a platform later than 2012, but needs nvs memory
--
2.8.0.rc2.1.gbe9624a
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 733d3e5497070d05971352ca5087bac83c197c3d Mon Sep 17 00:00:00 2001
From: Or Gerlitz <ogerlitz(a)mellanox.com>
Date: Thu, 31 May 2018 11:32:56 +0300
Subject: [PATCH] net/mlx5e: Avoid dealing with vport representors if not being
e-switch manager
In smartnic env, the host (PF) driver might not be an e-switch
manager, hence the switchdev mode representors are running on
the embedded cpu (EC) and not at the host.
As such, we should avoid dealing with vport representors if
not being esw manager.
While here, make sure to disallow eswitch switchdev related
setups through devlink if we are not esw managers.
Fixes: cb67b832921c ('net/mlx5e: Introduce SRIOV VF representors')
Signed-off-by: Or Gerlitz <ogerlitz(a)mellanox.com>
Reviewed-by: Eli Cohen <eli(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 56c1b6f5593e..dae4156a710d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2846,7 +2846,7 @@ void mlx5e_activate_priv_channels(struct mlx5e_priv *priv)
mlx5e_activate_channels(&priv->channels);
netif_tx_start_all_queues(priv->netdev);
- if (MLX5_VPORT_MANAGER(priv->mdev))
+ if (MLX5_ESWITCH_MANAGER(priv->mdev))
mlx5e_add_sqs_fwd_rules(priv);
mlx5e_wait_channels_min_rx_wqes(&priv->channels);
@@ -2857,7 +2857,7 @@ void mlx5e_deactivate_priv_channels(struct mlx5e_priv *priv)
{
mlx5e_redirect_rqts_to_drop(priv);
- if (MLX5_VPORT_MANAGER(priv->mdev))
+ if (MLX5_ESWITCH_MANAGER(priv->mdev))
mlx5e_remove_sqs_fwd_rules(priv);
/* FIXME: This is a W/A only for tx timeout watch dog false alarm when
@@ -4597,7 +4597,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
mlx5e_set_netdev_dev_addr(netdev);
#if IS_ENABLED(CONFIG_MLX5_ESWITCH)
- if (MLX5_VPORT_MANAGER(mdev))
+ if (MLX5_ESWITCH_MANAGER(mdev))
netdev->switchdev_ops = &mlx5e_switchdev_ops;
#endif
@@ -4753,7 +4753,7 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv)
mlx5e_enable_async_events(priv);
- if (MLX5_VPORT_MANAGER(priv->mdev))
+ if (MLX5_ESWITCH_MANAGER(priv->mdev))
mlx5e_register_vport_reps(priv);
if (netdev->reg_state != NETREG_REGISTERED)
@@ -4788,7 +4788,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv)
queue_work(priv->wq, &priv->set_rx_mode_work);
- if (MLX5_VPORT_MANAGER(priv->mdev))
+ if (MLX5_ESWITCH_MANAGER(priv->mdev))
mlx5e_unregister_vport_reps(priv);
mlx5e_disable_async_events(priv);
@@ -4972,7 +4972,7 @@ static void *mlx5e_add(struct mlx5_core_dev *mdev)
return NULL;
#ifdef CONFIG_MLX5_ESWITCH
- if (MLX5_VPORT_MANAGER(mdev)) {
+ if (MLX5_ESWITCH_MANAGER(mdev)) {
rpriv = mlx5e_alloc_nic_rep_priv(mdev);
if (!rpriv) {
mlx5_core_warn(mdev, "Failed to alloc NIC rep priv data\n");
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 60236f73373c..2b8040a3cdbd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -823,7 +823,7 @@ bool mlx5e_is_uplink_rep(struct mlx5e_priv *priv)
struct mlx5e_rep_priv *rpriv = priv->ppriv;
struct mlx5_eswitch_rep *rep;
- if (!MLX5_CAP_GEN(priv->mdev, vport_group_manager))
+ if (!MLX5_ESWITCH_MANAGER(priv->mdev))
return false;
rep = rpriv->rep;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index cecd201f0b73..91f1209886ff 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -1079,8 +1079,8 @@ static int mlx5_devlink_eswitch_check(struct devlink *devlink)
if (MLX5_CAP_GEN(dev, port_type) != MLX5_CAP_PORT_TYPE_ETH)
return -EOPNOTSUPP;
- if (!MLX5_CAP_GEN(dev, vport_group_manager))
- return -EOPNOTSUPP;
+ if(!MLX5_ESWITCH_MANAGER(dev))
+ return -EPERM;
if (dev->priv.eswitch->mode == SRIOV_NONE)
return -EOPNOTSUPP;
====================
READ ME. I have this stale email in my outgoing draft folder, and I
have no idea if I actually sent this out successfully or not.
Please double check, thanks!
====================
Please queue up the following networking bug fixes for v4.14 and v4.17
-stable, repectively.
Thank you!
From: "Gautham R. Shenoy" <ego(a)linux.vnet.ibm.com>
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror
SPRN_USPRG3 are used as userspace VDSO write and read registers
respectively.
SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not
restored. As a result, any read from SPRN_USPRG3 returns zero on an
exit from stop4 and above.
Thus in this situation, on POWER9, any call from sched_getcpu() always
returns zero, as on powerpc, we call __kernel_getcpu() which relies
upon SPRN_USPRG3 to report the CPU and NUMA node information.
Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state
with the sprg_vdso value that is cached in PACA.
Fixes: e1c1cfed5432 ("powerpc/powernv: Save/Restore additional SPRs
for stop4 cpuidle")
Reported-by: Florian Weimer <fweimer(a)redhat.com>
Cc: <stable(a)vger.kernel.org> # 4.14
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Michael Neuling <mikey(a)neuling.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Vaidyanathan Srinivasan <svaidy(a)linux.vnet.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego(a)linux.vnet.ibm.com>
---
Change from v1:
Restoring the SPRG3 from paca->sprg_vdso instead of saving
it separately during stop-entry, as suggested by Mikey.
arch/powerpc/kernel/idle_book3s.S | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index d85d551..672ead8 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -144,7 +144,9 @@ power9_restore_additional_sprs:
mtspr SPRN_MMCR1, r4
ld r3, STOP_MMCR2(r13)
+ ld r4, PACA_SPRG_VDSO(r13)
mtspr SPRN_MMCR2, r3
+ mtspr SPRN_SPRG3, r4
blr
/*
--
1.8.3.1
chip->read_buf is left unassigned since commit 4da712e70294 ("mtd: nand:
fsmc: use ->exec_op()"), leading to a NULL pointer dereference when it's
called from fsmc_read_page_hwecc(). Fix that by using the appropriate
helper to read data out of the NAND.
Fixes: 4da712e70294 ("mtd: nand: fsmc: use ->exec_op()")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
---
drivers/mtd/nand/raw/fsmc_nand.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/fsmc_nand.c b/drivers/mtd/nand/raw/fsmc_nand.c
index f4a5a317d4ae..e1086a010b88 100644
--- a/drivers/mtd/nand/raw/fsmc_nand.c
+++ b/drivers/mtd/nand/raw/fsmc_nand.c
@@ -740,7 +740,7 @@ static int fsmc_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
for (i = 0, s = 0; s < eccsteps; s++, i += eccbytes, p += eccsize) {
nand_read_page_op(chip, page, s * eccsize, NULL, 0);
chip->ecc.hwctl(mtd, NAND_ECC_READ);
- chip->read_buf(mtd, p, eccsize);
+ nand_read_data_op(chip, p, eccsize, false);
for (j = 0; j < eccbytes;) {
struct mtd_oob_region oobregion;
--
2.14.1
For nouveau, while the GPU is guaranteed to be on when a hotplug has
been received, the same assertion does not hold true if a connector
probe has been started by userspace without having had received a sysfs
event.
So ensure that any connector probing keeps the GPU alive for the
duration of the probe by introducing
drm_helper_probe_single_connector_modes_with_rpm(). It's the same as
drm_helper_probe_single_connector_modes, but it handles holding a power
reference to the device for the duration of the connector probe.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Reviewed-by: Karol Herbst <karolherbst(a)gmail.com>
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: stable(a)vger.kernel.org
---
Changes since v1:
- Add a generic helper to DRM to handle this
drivers/gpu/drm/drm_probe_helper.c | 31 +++++++++++++++++++++
drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_connector.c | 4 +--
include/drm/drm_crtc_helper.h | 7 +++--
4 files changed, 38 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
index 527743394150..0a9d6748b854 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -31,6 +31,7 @@
#include <linux/export.h>
#include <linux/moduleparam.h>
+#include <linux/pm_runtime.h>
#include <drm/drmP.h>
#include <drm/drm_crtc.h>
@@ -541,6 +542,36 @@ int drm_helper_probe_single_connector_modes(struct drm_connector *connector,
}
EXPORT_SYMBOL(drm_helper_probe_single_connector_modes);
+/**
+ * drm_helper_probe_single_connector_modes_with_rpm - get complete set of
+ * display modes
+ * @connector: connector to probe
+ * @maxX: max width for modes
+ * @maxY: max height for modes
+ *
+ * Same as drm_helper_probe_single_connector_modes, except that it makes sure
+ * that the device is active by synchronously grabbing a runtime power
+ * reference while probing.
+ *
+ * Returns:
+ * The number of modes found on @connector.
+ */
+int drm_helper_probe_single_connector_modes_with_rpm(struct drm_connector *connector,
+ uint32_t maxX, uint32_t maxY)
+{
+ int ret;
+
+ ret = pm_runtime_get_sync(connector->dev->dev);
+ if (ret < 0 && ret != -EACCES)
+ return ret;
+
+ ret = drm_helper_probe_single_connector_modes(connector, maxX, maxY);
+
+ pm_runtime_put(connector->dev->dev);
+ return ret;
+}
+EXPORT_SYMBOL(drm_helper_probe_single_connector_modes_with_rpm);
+
/**
* drm_kms_helper_hotplug_event - fire off KMS hotplug events
* @dev: drm_device whose connector state changed
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index fa3ab618a0f9..c54767b50fd8 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -858,7 +858,7 @@ static const struct drm_connector_funcs
nv50_mstc = {
.reset = nouveau_conn_reset,
.detect = nv50_mstc_detect,
- .fill_modes = drm_helper_probe_single_connector_modes,
+ .fill_modes = drm_helper_probe_single_connector_modes_with_rpm,
.destroy = nv50_mstc_destroy,
.atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
.atomic_destroy_state = nouveau_conn_atomic_destroy_state,
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 2a45b4c2ceb0..8d9070779261 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -1088,7 +1088,7 @@ nouveau_connector_funcs = {
.reset = nouveau_conn_reset,
.detect = nouveau_connector_detect,
.force = nouveau_connector_force,
- .fill_modes = drm_helper_probe_single_connector_modes,
+ .fill_modes = drm_helper_probe_single_connector_modes_with_rpm,
.set_property = nouveau_connector_set_property,
.destroy = nouveau_connector_destroy,
.atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
@@ -1103,7 +1103,7 @@ nouveau_connector_funcs_lvds = {
.reset = nouveau_conn_reset,
.detect = nouveau_connector_detect_lvds,
.force = nouveau_connector_force,
- .fill_modes = drm_helper_probe_single_connector_modes,
+ .fill_modes = drm_helper_probe_single_connector_modes_with_rpm,
.set_property = nouveau_connector_set_property,
.destroy = nouveau_connector_destroy,
.atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
diff --git a/include/drm/drm_crtc_helper.h b/include/drm/drm_crtc_helper.h
index 6914633037a5..8f3f6d6fcc8c 100644
--- a/include/drm/drm_crtc_helper.h
+++ b/include/drm/drm_crtc_helper.h
@@ -64,9 +64,10 @@ int drm_helper_crtc_mode_set_base(struct drm_crtc *crtc, int x, int y,
struct drm_framebuffer *old_fb);
/* drm_probe_helper.c */
-int drm_helper_probe_single_connector_modes(struct drm_connector
- *connector, uint32_t maxX,
- uint32_t maxY);
+int drm_helper_probe_single_connector_modes(struct drm_connector *connector,
+ uint32_t maxX, uint32_t maxY);
+int drm_helper_probe_single_connector_modes_with_rpm(struct drm_connector *connector,
+ uint32_t maxX, uint32_t maxY);
int drm_helper_probe_detect(struct drm_connector *connector,
struct drm_modeset_acquire_ctx *ctx,
bool force);
--
2.17.1
When DP MST hubs get confused, they can occasionally stop responding for
a good bit of time up until the point where the DRM driver manages to
do the right DPCD accesses to get it to start responding again. In a
worst case scenario however, this process can take upwards of 10+
seconds.
Currently we use the default output_poll_changed handler
drm_fb_helper_output_poll_changed() to handle output polling, which
doesn't happen to grab any power references on the device when polling.
If we're unlucky enough to have a hub (such as Lenovo's infamous laptop
docks for the P5x/P7x series) that's easily startled and confused, this
can lead to a pretty nasty deadlock situation that looks like this:
- Hotplug event from hub happens, we enter
drm_fb_helper_output_poll_changed() and start communicating with the
hub
- While we're in drm_fb_helper_output_poll_changed() and attempting to
communicate with the hub, we end up confusing it and cause it to stop
responding for at least 10 seconds
- After 5 seconds of being in drm_fb_helper_output_poll_changed(), the
pm core attempts to put the GPU into autosuspend, which ends up
calling drm_kms_helper_poll_disable()
- While the runtime PM core is waiting in drm_kms_helper_poll_disable()
for the output poll to finish, we end up finally detecting an MST
display
- We notice the new display and tries to enable it, which triggers
an atomic commit which triggers a call to pm_runtime_get_sync()
- the output poll thread deadlocks the pm core waiting for the pm core
to finish the autosuspend request while the pm core waits for the
output poll thread to finish
Sample:
[ 246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds.
[ 246.673398] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.676527] kworker/4:0 D 0 37 2 0x80000000
[ 246.677580] Workqueue: events output_poll_execute [drm_kms_helper]
[ 246.678704] Call Trace:
[ 246.679753] __schedule+0x322/0xaf0
[ 246.680916] schedule+0x33/0x90
[ 246.681924] schedule_preempt_disabled+0x15/0x20
[ 246.683023] __mutex_lock+0x569/0x9a0
[ 246.684035] ? kobject_uevent_env+0x117/0x7b0
[ 246.685132] ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.686179] mutex_lock_nested+0x1b/0x20
[ 246.687278] ? mutex_lock_nested+0x1b/0x20
[ 246.688307] drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.689420] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[ 246.690462] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[ 246.691570] output_poll_execute+0x198/0x1c0 [drm_kms_helper]
[ 246.692611] process_one_work+0x231/0x620
[ 246.693725] worker_thread+0x214/0x3a0
[ 246.694756] kthread+0x12b/0x150
[ 246.695856] ? wq_pool_ids_show+0x140/0x140
[ 246.696888] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.697998] ret_from_fork+0x3a/0x50
[ 246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds.
[ 246.700153] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.702278] kworker/0:1 D 0 60 2 0x80000000
[ 246.703293] Workqueue: pm pm_runtime_work
[ 246.704393] Call Trace:
[ 246.705403] __schedule+0x322/0xaf0
[ 246.706439] ? wait_for_completion+0x104/0x190
[ 246.707393] schedule+0x33/0x90
[ 246.708375] schedule_timeout+0x3a5/0x590
[ 246.709289] ? mark_held_locks+0x58/0x80
[ 246.710208] ? _raw_spin_unlock_irq+0x2c/0x40
[ 246.711222] ? wait_for_completion+0x104/0x190
[ 246.712134] ? trace_hardirqs_on_caller+0xf4/0x190
[ 246.713094] ? wait_for_completion+0x104/0x190
[ 246.713964] wait_for_completion+0x12c/0x190
[ 246.714895] ? wake_up_q+0x80/0x80
[ 246.715727] ? get_work_pool+0x90/0x90
[ 246.716649] flush_work+0x1c9/0x280
[ 246.717483] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[ 246.718442] __cancel_work_timer+0x146/0x1d0
[ 246.719247] cancel_delayed_work_sync+0x13/0x20
[ 246.720043] drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper]
[ 246.721123] nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau]
[ 246.721897] pci_pm_runtime_suspend+0x6b/0x190
[ 246.722825] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.723737] __rpm_callback+0x7a/0x1d0
[ 246.724721] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.725607] rpm_callback+0x24/0x80
[ 246.726553] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.727376] rpm_suspend+0x142/0x6b0
[ 246.728185] pm_runtime_work+0x97/0xc0
[ 246.728938] process_one_work+0x231/0x620
[ 246.729796] worker_thread+0x44/0x3a0
[ 246.730614] kthread+0x12b/0x150
[ 246.731395] ? wq_pool_ids_show+0x140/0x140
[ 246.732202] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.732878] ret_from_fork+0x3a/0x50
[ 246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds.
[ 246.734587] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.736113] kworker/4:2 D 0 422 2 0x80000080
[ 246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[ 246.737665] Call Trace:
[ 246.738490] __schedule+0x322/0xaf0
[ 246.739250] schedule+0x33/0x90
[ 246.739908] rpm_resume+0x19c/0x850
[ 246.740750] ? finish_wait+0x90/0x90
[ 246.741541] __pm_runtime_resume+0x4e/0x90
[ 246.742370] nv50_disp_atomic_commit+0x31/0x210 [nouveau]
[ 246.743124] drm_atomic_commit+0x4a/0x50 [drm]
[ 246.743775] restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper]
[ 246.744603] restore_fbdev_mode+0x31/0x140 [drm_kms_helper]
[ 246.745373] drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper]
[ 246.746220] drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
[ 246.746884] drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper]
[ 246.747675] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[ 246.748544] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[ 246.749439] nv50_mstm_hotplug+0x15/0x20 [nouveau]
[ 246.750111] drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper]
[ 246.750764] drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper]
[ 246.751602] drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper]
[ 246.752314] process_one_work+0x231/0x620
[ 246.752979] worker_thread+0x44/0x3a0
[ 246.753838] kthread+0x12b/0x150
[ 246.754619] ? wq_pool_ids_show+0x140/0x140
[ 246.755386] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.756162] ret_from_fork+0x3a/0x50
[ 246.756847]
Showing all locks held in the system:
[ 246.758261] 3 locks held by kworker/4:0/37:
[ 246.759016] #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.759856] #1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.760670] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.761516] 2 locks held by kworker/0:1/60:
[ 246.762274] #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.762982] #1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.763890] 1 lock held by khungtaskd/64:
[ 246.764664] #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[ 246.765588] 5 locks held by kworker/4:2/422:
[ 246.766440] #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.767390] #1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.768154] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper]
[ 246.768966] #3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper]
[ 246.769921] #4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm]
[ 246.770839] 1 lock held by dmesg/1038:
[ 246.771739] 2 locks held by zsh/1172:
[ 246.772650] #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[ 246.773680] #1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
[ 246.775522] =============================================
So, to fix this (and any other possible deadlock issues like this that
could occur in the output_poll_changed patch) we make sure that
nouveau's output_poll_changed functions grab a runtime power ref before
sending any hotplug events, and hold it until we're finished. We
introduce this through adding a generic DRM helper which other drivers
may reuse.
This fixes deadlock issues when in fbcon with nouveau on my P50, and
should fix it for everyone else's as well!
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Reviewed-by: Karol Herbst <karolherbst(a)gmail.com>
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: stable(a)vger.kernel.org
---
Changes since v1:
- Add a generic helper for DRM to handle this
drivers/gpu/drm/drm_fb_helper.c | 23 +++++++++++++++++++++++
drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
include/drm/drm_fb_helper.h | 5 +++++
3 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 2ee1eaa66188..1ab2f3646526 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -34,6 +34,7 @@
#include <linux/sysrq.h>
#include <linux/slab.h>
#include <linux/module.h>
+#include <linux/pm_runtime.h>
#include <drm/drmP.h>
#include <drm/drm_crtc.h>
#include <drm/drm_fb_helper.h>
@@ -2928,6 +2929,28 @@ void drm_fb_helper_output_poll_changed(struct drm_device *dev)
}
EXPORT_SYMBOL(drm_fb_helper_output_poll_changed);
+/**
+ * drm_fb_helper_output_poll_changed_with_rpm - DRM mode
+ * config \.output_poll_changed
+ * helper for fbdev emulation
+ * @dev: DRM device
+ *
+ * Same as drm_fb_helper_output_poll_changed, except that it makes sure that
+ * the device is active by synchronously grabbing a runtime power reference
+ * while probing.
+ */
+void drm_fb_helper_output_poll_changed_with_rpm(struct drm_device *dev)
+{
+ int ret;
+
+ ret = pm_runtime_get_sync(dev->dev);
+ if (WARN_ON(ret < 0 && ret != -EACCES))
+ return;
+ drm_fb_helper_hotplug_event(dev->fb_helper);
+ pm_runtime_put(dev->dev);
+}
+EXPORT_SYMBOL(drm_fb_helper_output_poll_changed_with_rpm);
+
/* The Kconfig DRM_KMS_HELPER selects FRAMEBUFFER_CONSOLE (if !EXPERT)
* but the module doesn't depend on any fb console symbols. At least
* attempt to load fbcon to avoid leaving the system without a usable console.
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index eb3e41a78806..fa3ab618a0f9 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -2015,7 +2015,7 @@ nv50_disp_atomic_state_alloc(struct drm_device *dev)
static const struct drm_mode_config_funcs
nv50_disp_func = {
.fb_create = nouveau_user_framebuffer_create,
- .output_poll_changed = drm_fb_helper_output_poll_changed,
+ .output_poll_changed = drm_fb_helper_output_poll_changed_with_rpm,
.atomic_check = nv50_disp_atomic_check,
.atomic_commit = nv50_disp_atomic_commit,
.atomic_state_alloc = nv50_disp_atomic_state_alloc,
diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
index b069433e7fc1..ca809bfbaebb 100644
--- a/include/drm/drm_fb_helper.h
+++ b/include/drm/drm_fb_helper.h
@@ -330,6 +330,7 @@ void drm_fb_helper_fbdev_teardown(struct drm_device *dev);
void drm_fb_helper_lastclose(struct drm_device *dev);
void drm_fb_helper_output_poll_changed(struct drm_device *dev);
+void drm_fb_helper_output_poll_changed_with_rpm(struct drm_device *dev);
#else
static inline void drm_fb_helper_prepare(struct drm_device *dev,
struct drm_fb_helper *helper,
@@ -564,6 +565,10 @@ static inline void drm_fb_helper_output_poll_changed(struct drm_device *dev)
{
}
+static inline void drm_fb_helper_output_poll_changed_with_rpm(struct drm_device *dev)
+{
+}
+
#endif
static inline int
--
2.17.1
Dear Beneficiary) stable(a)vger.kernel.org,,
Good day!
I have important information for you here.
Please accept my sincere apologizes if my email disturbs your business or
personal ethics and let me first introduce myself to you before telling you my reason for contacting you My name is Mr. Jim Boke Lawson from Sierra Leon and a staff in the Card Accounts Management Section of a very
well-known bank, here at London United Kingdom.
One of our bank accounts here has a current deposit balance of (Nine
Million Three Hundred and Forty Six Thousand, One Hundred and Fifty two
Great British Pounds Sterling and Point Seven Pence Only) and the Card
account has been classified AS DORMANT (Lack of activities on the account
(deposited or withdrawal) for last operated (11yrs) now.
>From my private investigations and confirmation, the owner of this said
bank account, a Foreigner by name, Mr. Jonce Ludwig died on the 4th of
January 2007 in a boat accident that occurred here at Birmingham, UK
Since then, nobody has done anything as regards claiming this money as no
traces of family member or next of kin with any knowledge as to the
existence of this bank account or the funds here.
Further inquiry from our National Immigration Services here also indicated
that he (Mr. Jonce Ludwig) was ONLY on a single entry into the UK, and had
had intention to establish a Real Estate Firm here at United Kingdom, hence the deposit of this very fund with our bank.
I have been in search for a reliable and trusted foreign partner to work
with and transfer this money to you that is why I am contacting you to seek your consent to partner with me and get this funds out to you urgently.
This is totally free and 100% risk free because I am the bank officer in
charge of this particular account and it’s my duty at the bank to sign off
the deposit to whoever I present as the administrator. I therefore propose
to do this business with you, and I want you to stand in as the Next of Kin to attest to this fund, so that we can have the fund and Bank Account
details released to you after due processes have been followed.
Also, all the legal documents are readily available and valid to be signed
and stamped in your name and it will be done without any problem as the
fund is 100% legitimate and does not originate from drug money laundering,
or terrorism fund or any other illegal act.
As soon as the deal is done successfully, you will take 50% while 50% come
to me.
Please let me know your interest to work with me by responding to this
through my private email address as follows:
Thanking you for accepting to do this with us.
I remain sincerely,
Mr. Jimy Boke Lawson.
Hi Greg,
For your consideration, stable commits picked up from lede source
tree https://git.lede-project.org/?p=source.git for 4.9.y.
Cherry-picked and build tested on Linux v4.9.112.
Regards,
Amit Pundir
Heiner Kallweit (1):
mtd: m25p80: consider max message size in m25p80_read
Jonas Gorski (4):
spi/bcm63xx: make spi subsystem aware of message size limits
spi/bcm63xx: fix typo in bcm63xx_spi_max_length breaking compilation
bcm63xx_enet: correct clock usage
bcm63xx_enet: do not write to random DMA channel on BCM6345
drivers/mtd/devices/m25p80.c | 3 ++-
drivers/net/ethernet/broadcom/bcm63xx_enet.c | 34 ++++++++++++++++++++--------
drivers/spi/spi-bcm63xx.c | 9 ++++++++
3 files changed, 36 insertions(+), 10 deletions(-)
--
2.7.4
This is the start of the stable review cycle for the 4.17.8 release.
There are 1 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu Jul 19 11:47:15 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.17.8-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.17.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.17.8-rc1
Pavel Tatashin <pasha.tatashin(a)oracle.com>
mm: don't do zero_resv_unavail if memmap is not allocated
-------------
Diffstat:
Makefile | 4 ++--
include/linux/mm.h | 2 +-
mm/page_alloc.c | 4 ++--
3 files changed, 5 insertions(+), 5 deletions(-)
Hi Greg,
For your consideration, stable commits picked up from lede source
tree https://git.lede-project.org/?p=source.git for 4.14.y.
Cherry-picked and build tested on Linux v4.14.55.
Regards,
Amit Pundir
Christian Lamparter (2):
crypto: crypto4xx - remove bad list_del
crypto: crypto4xx - fix crypto4xx_build_pdr, crypto4xx_build_sdr leak
Jaehoon Chung (1):
PCI: exynos: Fix a potential init_clk_resources NULL pointer
dereference
Jonas Gorski (2):
bcm63xx_enet: correct clock usage
bcm63xx_enet: do not write to random DMA channel on BCM6345
drivers/crypto/amcc/crypto4xx_core.c | 23 +++++++++----------
drivers/net/ethernet/broadcom/bcm63xx_enet.c | 34 ++++++++++++++++++++--------
drivers/pci/dwc/pci-exynos.c | 3 ++-
3 files changed, 38 insertions(+), 22 deletions(-)
--
2.7.4
Hi Greg,
Please apply 3e4c56d41eef5595035872a2ec5a483f42e8917f ("ocfs2:
ip_alloc_sem should be taken in ocfs2_get_block()") to v4.9.y to fix
CVE-2017-18224.
Regards,
Salvatore
Hi Greg,
Please apply commit 853bc26a7ea39e354b9f8889ae7ad1492ffa28d2 ("ocfs2:
subsystem.su_mutex is required while accessing the item->ci_parent")
to v4.9.y to address CVE-2017-18216.
It has been already applied for instance to 3.16.57
(d9b4d618a22bf30a1c82dffc5c7cb3b1abda48dc) and 3.2.102
(dfd9f20a2db71ca01033040ecf69d5c0e67db629).
Regards,
Salvatore
Please backport the
f4eb17e1efe538d4da7d574bedb00a8dafcc26b7 ("Revert "sit: reload iphdr
in ipip6_rcv"") to 4.4.y.
It has been applied to 4.9.112 recently, but in 4.4.y the problem still exists.
The original commit was CCed to 'stable' but the reversal wasn't.
It causes a regression with non-working ping6 and log spam from sit.
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ae6efcae79dd2888243634b69fce51208b650192 Mon Sep 17 00:00:00 2001
From: Sean Wang <sean.wang(a)mediatek.com>
Date: Fri, 22 Jun 2018 11:49:06 +0800
Subject: [PATCH] pinctrl: mt7622: fix that pinctrl_claim_hogs cannot work
To allow claiming hogs by pinctrl, we cannot enable pinctrl until all
groups and functions are being added done. Also, it's necessary that
the corresponding gpiochip is being added when the pinctrl device is
enabled.
Cc: stable(a)vger.kernel.org
Fixes: d6ed93551320 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang(a)mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
diff --git a/drivers/pinctrl/mediatek/pinctrl-mt7622.c b/drivers/pinctrl/mediatek/pinctrl-mt7622.c
index e9eba62da233..42155d4e7f1b 100644
--- a/drivers/pinctrl/mediatek/pinctrl-mt7622.c
+++ b/drivers/pinctrl/mediatek/pinctrl-mt7622.c
@@ -1695,9 +1695,10 @@ static int mtk_pinctrl_probe(struct platform_device *pdev)
mtk_desc.custom_conf_items = mtk_conf_items;
#endif
- hw->pctrl = devm_pinctrl_register(&pdev->dev, &mtk_desc, hw);
- if (IS_ERR(hw->pctrl))
- return PTR_ERR(hw->pctrl);
+ err = devm_pinctrl_register_and_init(&pdev->dev, &mtk_desc, hw,
+ &hw->pctrl);
+ if (err)
+ return err;
/* Setup groups descriptions per SoC types */
err = mtk_build_groups(hw);
@@ -1713,11 +1714,19 @@ static int mtk_pinctrl_probe(struct platform_device *pdev)
return err;
}
+ /* For able to make pinctrl_claim_hogs, we must not enable pinctrl
+ * until all groups and functions are being added one.
+ */
+ err = pinctrl_enable(hw->pctrl);
+ if (err)
+ return err;
+
err = mtk_build_eint(hw, pdev);
if (err)
dev_warn(&pdev->dev,
"Failed to add EINT, but pinctrl still can work\n");
+ /* Build gpiochip should be after pinctrl_enable is done */
err = mtk_build_gpiochip(hw, pdev->dev.of_node);
if (err) {
dev_err(&pdev->dev, "Failed to add gpio_chip\n");
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8875059d2165f22236e87ed10188b0e18f116b93 Mon Sep 17 00:00:00 2001
From: Sean Wang <sean.wang(a)mediatek.com>
Date: Fri, 22 Jun 2018 11:49:05 +0800
Subject: [PATCH] pinctrl: mt7622: fix initialization sequence between eint and
gpiochip
Because gpichip applied in the driver must depend on mtk eint to implement
the input data debouncing and the translation between gpio and irq, it's
better to keep logic consistent with mtk eint being built prior to gpiochip
being added.
Cc: stable(a)vger.kernel.org
Fixes: e6dabd38d8e7 ("pinctrl: mediatek: add EINT support to MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang(a)mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
diff --git a/drivers/pinctrl/mediatek/pinctrl-mt7622.c b/drivers/pinctrl/mediatek/pinctrl-mt7622.c
index 9ad8cb7730d3..e9eba62da233 100644
--- a/drivers/pinctrl/mediatek/pinctrl-mt7622.c
+++ b/drivers/pinctrl/mediatek/pinctrl-mt7622.c
@@ -1713,17 +1713,17 @@ static int mtk_pinctrl_probe(struct platform_device *pdev)
return err;
}
+ err = mtk_build_eint(hw, pdev);
+ if (err)
+ dev_warn(&pdev->dev,
+ "Failed to add EINT, but pinctrl still can work\n");
+
err = mtk_build_gpiochip(hw, pdev->dev.of_node);
if (err) {
dev_err(&pdev->dev, "Failed to add gpio_chip\n");
return err;
}
- err = mtk_build_eint(hw, pdev);
- if (err)
- dev_warn(&pdev->dev,
- "Failed to add EINT, but pinctrl still can work\n");
-
platform_set_drvdata(pdev, hw);
return 0;
From: Paul Burton <paul.burton(a)mips.com>
The current MIPS implementation of arch_trigger_cpumask_backtrace() is
broken because it attempts to use synchronous IPIs despite the fact that
it may be run with interrupts disabled.
This means that when arch_trigger_cpumask_backtrace() is invoked, for
example by the RCU CPU stall watchdog, we may:
- Deadlock due to use of synchronous IPIs with interrupts disabled,
causing the CPU that's attempting to generate the backtrace output
to hang itself.
- Not succeed in generating the desired output from remote CPUs.
- Produce warnings about this from smp_call_function_many(), for
example:
[42760.526910] INFO: rcu_sched detected stalls on CPUs/tasks:
[42760.535755] 0-...!: (1 GPs behind) idle=ade/140000000000000/0 softirq=526944/526945 fqs=0
[42760.547874] 1-...!: (0 ticks this GP) idle=e4a/140000000000000/0 softirq=547885/547885 fqs=0
[42760.559869] (detected by 2, t=2162 jiffies, g=266689, c=266688, q=33)
[42760.568927] ------------[ cut here ]------------
[42760.576146] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:416 smp_call_function_many+0x88/0x20c
[42760.587839] Modules linked in:
[42760.593152] CPU: 2 PID: 1216 Comm: sh Not tainted 4.15.4-00373-gee058bb4d0c2 #2
[42760.603767] Stack : 8e09bd20 8e09bd20 8e09bd20 fffffff0 00000007 00000006 00000000 8e09bca8
[42760.616937] 95b2b379 95b2b379 807a0080 00000007 81944518 0000018a 00000032 00000000
[42760.630095] 00000000 00000030 80000000 00000000 806eca74 00000009 8017e2b8 000001a0
[42760.643169] 00000000 00000002 00000000 8e09baa4 00000008 808b8008 86d69080 8e09bca0
[42760.656282] 8e09ad50 805e20aa 00000000 00000000 00000000 8017e2b8 00000009 801070ca
[42760.669424] ...
[42760.673919] Call Trace:
[42760.678672] [<27fde568>] show_stack+0x70/0xf0
[42760.685417] [<84751641>] dump_stack+0xaa/0xd0
[42760.692188] [<699d671c>] __warn+0x80/0x92
[42760.698549] [<68915d41>] warn_slowpath_null+0x28/0x36
[42760.705912] [<f7c76c1c>] smp_call_function_many+0x88/0x20c
[42760.713696] [<6bbdfc2a>] arch_trigger_cpumask_backtrace+0x30/0x4a
[42760.722216] [<f845bd33>] rcu_dump_cpu_stacks+0x6a/0x98
[42760.729580] [<796e7629>] rcu_check_callbacks+0x672/0x6ac
[42760.737476] [<059b3b43>] update_process_times+0x18/0x34
[42760.744981] [<6eb94941>] tick_sched_handle.isra.5+0x26/0x38
[42760.752793] [<478d3d70>] tick_sched_timer+0x1c/0x50
[42760.759882] [<e56ea39f>] __hrtimer_run_queues+0xc6/0x226
[42760.767418] [<e88bbcae>] hrtimer_interrupt+0x88/0x19a
[42760.775031] [<6765a19e>] gic_compare_interrupt+0x2e/0x3a
[42760.782761] [<0558bf5f>] handle_percpu_devid_irq+0x78/0x168
[42760.790795] [<90c11ba2>] generic_handle_irq+0x1e/0x2c
[42760.798117] [<1b6d462c>] gic_handle_local_int+0x38/0x86
[42760.805545] [<b2ada1c7>] gic_irq_dispatch+0xa/0x14
[42760.812534] [<90c11ba2>] generic_handle_irq+0x1e/0x2c
[42760.820086] [<c7521934>] do_IRQ+0x16/0x20
[42760.826274] [<9aef3ce6>] plat_irq_dispatch+0x62/0x94
[42760.833458] [<6a94b53c>] except_vec_vi_end+0x70/0x78
[42760.840655] [<22284043>] smp_call_function_many+0x1ba/0x20c
[42760.848501] [<54022b58>] smp_call_function+0x1e/0x2c
[42760.855693] [<ab9fc705>] flush_tlb_mm+0x2a/0x98
[42760.862730] [<0844cdd0>] tlb_flush_mmu+0x1c/0x44
[42760.869628] [<cb259b74>] arch_tlb_finish_mmu+0x26/0x3e
[42760.877021] [<1aeaaf74>] tlb_finish_mmu+0x18/0x66
[42760.883907] [<b3fce717>] exit_mmap+0x76/0xea
[42760.890428] [<c4c8a2f6>] mmput+0x80/0x11a
[42760.896632] [<a41a08f4>] do_exit+0x1f4/0x80c
[42760.903158] [<ee01cef6>] do_group_exit+0x20/0x7e
[42760.909990] [<13fa8d54>] __wake_up_parent+0x0/0x1e
[42760.917045] [<46cf89d0>] smp_call_function_many+0x1a2/0x20c
[42760.924893] [<8c21a93b>] syscall_common+0x14/0x1c
[42760.931765] ---[ end trace 02aa09da9dc52a60 ]---
[42760.938342] ------------[ cut here ]------------
[42760.945311] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:291 smp_call_function_single+0xee/0xf8
...
This patch switches MIPS' arch_trigger_cpumask_backtrace() to use async
IPIs & smp_call_function_single_async() in order to resolve this
problem. We ensure use of the pre-allocated call_single_data_t
structures is serialized by maintaining a cpumask indicating that
they're busy, and refusing to attempt to send an IPI when a CPU's bit is
set in this mask. This should only happen if a CPU hasn't responded to a
previous backtrace IPI - ie. if it's hung - and we print a warning to
the console in this case.
I've marked this for stable branches as far back as v4.9, to which it
applies cleanly. Strictly speaking the faulty MIPS implementation can be
traced further back to commit 856839b76836 ("MIPS: Add
arch_trigger_all_cpu_backtrace() function") in v3.19, but kernel
versions v3.19 through v4.8 will require further work to backport due to
the rework performed in commit 9a01c3ed5cdb ("nmi_backtrace: add more
trigger_*_cpu_backtrace() methods").
Signed-off-by: Paul Burton <paul.burton(a)mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19597/
Cc: James Hogan <jhogan(a)kernel.org>
Cc: Ralf Baechle <ralf(a)linux-mips.org>
Cc: linux-mips(a)linux-mips.org
Cc: stable(a)vger.kernel.org
Fixes: 856839b76836 ("MIPS: Add arch_trigger_all_cpu_backtrace() function")
Fixes: 9a01c3ed5cdb ("nmi_backtrace: add more trigger_*_cpu_backtrace() methods")
[ Huacai: backported to 4.4: Restruction since generic NMI solution is unavailable ]
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
arch/mips/kernel/process.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index f96048a..354b99f 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -629,21 +629,48 @@ unsigned long arch_align_stack(unsigned long sp)
return sp & ALMASK;
}
+static DEFINE_PER_CPU(struct call_single_data, backtrace_csd);
+static struct cpumask backtrace_csd_busy;
+
static void arch_dump_stack(void *info)
{
struct pt_regs *regs;
+ static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
+ arch_spin_lock(&lock);
regs = get_irq_regs();
if (regs)
show_regs(regs);
else
dump_stack();
+ arch_spin_unlock(&lock);
+
+ cpumask_clear_cpu(smp_processor_id(), &backtrace_csd_busy);
}
void arch_trigger_all_cpu_backtrace(bool include_self)
{
- smp_call_function(arch_dump_stack, NULL, 1);
+ struct call_single_data *csd;
+ int cpu;
+
+ for_each_cpu(cpu, cpu_online_mask) {
+ /*
+ * If we previously sent an IPI to the target CPU & it hasn't
+ * cleared its bit in the busy cpumask then it didn't handle
+ * our previous IPI & it's not safe for us to reuse the
+ * call_single_data_t.
+ */
+ if (cpumask_test_and_set_cpu(cpu, &backtrace_csd_busy)) {
+ pr_warn("Unable to send backtrace IPI to CPU%u - perhaps it hung?\n",
+ cpu);
+ continue;
+ }
+
+ csd = &per_cpu(backtrace_csd, cpu);
+ csd->func = arch_dump_stack;
+ smp_call_function_single_async(cpu, csd);
+ }
}
int mips_get_process_fp_mode(struct task_struct *task)
--
2.7.0
The datasheet does not document any registers to control drive strength,
and no drive strength registers are for this reason described for this
SoC. The flags indicating that drive strength can be controlled are
however set for some pins in the driver.
This leads to a NULL pointer dereference when the sh-pfc core tries to
access the struct describing the drive strength registers, for example
when reading the sysfs file pinconf-pins.
Fix this by removing the SH_PFC_PIN_CFG_DRIVE_STRENGTH from all pins
Fixes: b92ac66a1819602b ("pinctrl: sh-pfc: Add R8A77970 PFC support")
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas(a)ragnatech.se>
---
drivers/pinctrl/sh-pfc/pfc-r8a77970.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
---
Hi,
This is a backport of commit 550b6f7e8cf93fc2753aa01e655ed5471012ab5a
from Linusw linux-pinctrl.git tree. It's applicable for v4.16 and v4.17.
This is my first backport patch submission for stable, please let me
know if I can improve the format of this patch commit message or subject
in any way.
diff --git a/drivers/pinctrl/sh-pfc/pfc-r8a77970.c b/drivers/pinctrl/sh-pfc/pfc-r8a77970.c
index b1bb7263532b3bf9..049b374aa4ae4698 100644
--- a/drivers/pinctrl/sh-pfc/pfc-r8a77970.c
+++ b/drivers/pinctrl/sh-pfc/pfc-r8a77970.c
@@ -22,12 +22,12 @@
#include "sh_pfc.h"
#define CPU_ALL_PORT(fn, sfx) \
- PORT_GP_CFG_22(0, fn, sfx, SH_PFC_PIN_CFG_DRIVE_STRENGTH), \
- PORT_GP_CFG_28(1, fn, sfx, SH_PFC_PIN_CFG_DRIVE_STRENGTH), \
- PORT_GP_CFG_17(2, fn, sfx, SH_PFC_PIN_CFG_DRIVE_STRENGTH), \
- PORT_GP_CFG_17(3, fn, sfx, SH_PFC_PIN_CFG_DRIVE_STRENGTH), \
- PORT_GP_CFG_6(4, fn, sfx, SH_PFC_PIN_CFG_DRIVE_STRENGTH), \
- PORT_GP_CFG_15(5, fn, sfx, SH_PFC_PIN_CFG_DRIVE_STRENGTH)
+ PORT_GP_22(0, fn, sfx), \
+ PORT_GP_28(1, fn, sfx), \
+ PORT_GP_17(2, fn, sfx), \
+ PORT_GP_17(3, fn, sfx), \
+ PORT_GP_6(4, fn, sfx), \
+ PORT_GP_15(5, fn, sfx)
/*
* F_() : just information
* FM() : macro for FN_xxx / xxx_MARK
--
2.18.0
Hi Greg,
These commits are missing from 4.14-stable tree. I have seen your failed
mail for one of them.
(resending as I missed adding stable in the previous mail)
--
Regards
Sudip
changes in v1:
Based on processor mode, action is taken i.e. for user mode, only
user process is killed otherwise as usual kernel will panic and
system halts.
Hari Vyas (1):
arm64: fix kernel panic on serror exception caused by user process
arch/arm64/kernel/traps.c | 7 +++++++
1 file changed, 7 insertions(+)
--
1.9.1
When the devfreq driver and the governor driver are built as modules,
the call to devfreq_add_device() or governor_store() fails because the
governor driver is not loaded at the time the devfreq driver loads. The
devfreq driver has a build dependency on the governor but also should
have a runtime dependency. We need to make sure that the governor driver
is loaded before the devfreq driver.
This patch fixes this bug by adding a try_then_request_governor()
function. First tries to find the governor, and then, if it is not found,
it requests the module and tries again.
Fixes: 1b5c1be2c88e (PM / devfreq: map devfreq drivers to governor using name)
Signed-off-by: Enric Balletbo i Serra <enric.balletbo(a)collabora.com>
---
Changes in v5:
- Requested by MyungJoo and Chanwoo.
- Fix returning without the lock acquired after request_module.
- Requested by Chanwoo.
- In request governor function check if governor name is NULL or not.
- Remove some unrelated changes (added/removed some blank lines).
Changes in v4:
- Kept "locked" devfreq_list from the return of find_devfreq_governor() to
the unlock of governor_store(). Requested by MyungJoo Ham.
Changes in v3:
- Remove unneded change in dev_err message.
- Fix err returned value in case to not find the governor.
Changes in v2:
- Add a new function to request the module and call that function from
devfreq_add_device and governor_store.
drivers/devfreq/devfreq.c | 53 ++++++++++++++++++++++++++++++++++++---
1 file changed, 49 insertions(+), 4 deletions(-)
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 0b5b3abe054e..aa92fbf9f0dd 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -11,6 +11,7 @@
*/
#include <linux/kernel.h>
+#include <linux/kmod.h>
#include <linux/sched.h>
#include <linux/errno.h>
#include <linux/err.h>
@@ -221,6 +222,49 @@ static struct devfreq_governor *find_devfreq_governor(const char *name)
return ERR_PTR(-ENODEV);
}
+/**
+ * try_then_request_governor() - Try to find the governor and request the
+ * module if is not found.
+ * @name: name of the governor
+ *
+ * Search the list of devfreq governors and request the module and try again
+ * if is not found. This can happen when both drivers (the governor driver
+ * and the driver that call devfreq_add_device) are built as modules.
+ * devfreq_list_lock should be held by the caller. Returns the matched
+ * governor's pointer.
+ */
+static struct devfreq_governor *try_then_request_governor(const char *name)
+{
+ struct devfreq_governor *governor;
+ int err = 0;
+
+ if (IS_ERR_OR_NULL(name)) {
+ pr_err("DEVFREQ: %s: Invalid parameters\n", __func__);
+ return ERR_PTR(-EINVAL);
+ }
+ WARN(!mutex_is_locked(&devfreq_list_lock),
+ "devfreq_list_lock must be locked.");
+
+ governor = find_devfreq_governor(name);
+ if (IS_ERR(governor)) {
+ mutex_unlock(&devfreq_list_lock);
+
+ if (!strncmp(name, DEVFREQ_GOV_SIMPLE_ONDEMAND,
+ DEVFREQ_NAME_LEN))
+ err = request_module("governor_%s", "simpleondemand");
+ else
+ err = request_module("governor_%s", name);
+ /* Restore previous state before return */
+ mutex_lock(&devfreq_list_lock);
+ if (err)
+ return NULL;
+
+ governor = find_devfreq_governor(name);
+ }
+
+ return governor;
+}
+
static int devfreq_notify_transition(struct devfreq *devfreq,
struct devfreq_freqs *freqs, unsigned int state)
{
@@ -645,9 +689,8 @@ struct devfreq *devfreq_add_device(struct device *dev,
mutex_unlock(&devfreq->lock);
mutex_lock(&devfreq_list_lock);
- list_add(&devfreq->node, &devfreq_list);
- governor = find_devfreq_governor(devfreq->governor_name);
+ governor = try_then_request_governor(devfreq->governor_name);
if (IS_ERR(governor)) {
dev_err(dev, "%s: Unable to find governor for the device\n",
__func__);
@@ -663,12 +706,14 @@ struct devfreq *devfreq_add_device(struct device *dev,
__func__);
goto err_init;
}
+
+ list_add(&devfreq->node, &devfreq_list);
+
mutex_unlock(&devfreq_list_lock);
return devfreq;
err_init:
- list_del(&devfreq->node);
mutex_unlock(&devfreq_list_lock);
device_unregister(&devfreq->dev);
@@ -989,7 +1034,7 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr,
return -EINVAL;
mutex_lock(&devfreq_list_lock);
- governor = find_devfreq_governor(str_governor);
+ governor = try_then_request_governor(str_governor);
if (IS_ERR(governor)) {
ret = PTR_ERR(governor);
goto out;
--
2.18.0
I'm announcing the release of the 4.17.8 kernel.
This is to fix the i386 issue that was in the 4.17.7 release. All
should be fine now.
The updated 4.17.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.17.y
and can be browsed at the normal kernel.org git web browser:
http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
include/linux/mm.h | 2 +-
mm/page_alloc.c | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)
Greg Kroah-Hartman (1):
Linux 4.17.8
Pavel Tatashin (1):
mm: don't do zero_resv_unavail if memmap is not allocated
Fedora has integrated the jitter entropy daemon to work around slow
boot problems, especially on VM's that don't support virtio-rng:
https://bugzilla.redhat.com/show_bug.cgi?id=1572944
It's understandable why they did this, but the Jitter entropy daemon
works fundamentally on the principle: "the CPU microarchitecture is
**so** complicated and we can't figure it out, so it *must* be
random". Yes, it uses statistical tests to "prove" it is secure, but
AES_ENCRYPT(NSA_KEY, COUNTER++) will also pass statistical tests with
flying colors.
So if RDRAND is available, mix it into entropy submitted from
userspace. It can't hurt, and if you believe the NSA has backdoored
RDRAND, then they probably have enough details about the Intel
microarchitecture that they can reverse engineer how the Jitter
entropy daemon affects the microarchitecture, and attack its output
stream. And if RDRAND is in fact an honest DRNG, it will immeasurably
improve on what the Jitter entropy daemon might produce.
This also provides some protection against someone who is able to read
or set the entropy seed file.
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)vger.kernel.org
Cc: Arnd Bergmann <arnd(a)arndb.de>
---
Changes in v2:
- Fix silly typo that Arnd pointed out in check the return value of
arch_get_random_int()
- Break out of the loop after the first failure reported by
arch_get_random_int()
drivers/char/random.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/char/random.c b/drivers/char/random.c
index cd888d4ee605..bd449ad52442 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1895,14 +1895,22 @@ static int
write_pool(struct entropy_store *r, const char __user *buffer, size_t count)
{
size_t bytes;
- __u32 buf[16];
+ __u32 t, buf[16];
const char __user *p = buffer;
while (count > 0) {
+ int b, i = 0;
+
bytes = min(count, sizeof(buf));
if (copy_from_user(&buf, p, bytes))
return -EFAULT;
+ for (b = bytes ; b > 0 ; b -= sizeof(__u32), i++) {
+ if (!arch_get_random_int(&t))
+ break;
+ buf[i] ^= t;
+ }
+
count -= bytes;
p += bytes;
--
2.18.0.rc0
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
We broke the LVDS notifier resume thing in (presumably) commit
e2c8b8701e2d ("drm/i915: Use atomic helpers for suspend, v2.") as
we no longer duplicate the current state in the LVDS notifier and
thus we never resume it properly either.
Instead of trying to fix it again let's just kill off the lid
notifier entirely. None of the machines tested thus far have
apparently needed it. Originally the lid notifier was added to
work around cases where the VBIOS was clobbering some of the
hardware state behind the driver's back, mostly on Thinkpads.
We now have a few report of Thinkpads working just fine without
the notifier. So maybe it was misdiagnosed originally, or
something else has changed (ACPI video stuff perhaps?).
If we do end up finding a machine where the VBIOS is still causing
problems I would suggest that we first try setting various bits in
the VBIOS scratch registers. There are several to choose from that
may instruct the VBIOS to steer clear.
With the notifier gone we'll also stop looking at the panel status
in ->detect().
Cc: stable(a)vger.kernel.org
Cc: Wolfgang Draxinger <wdraxinger.maillist(a)draxit.de>
Cc: Vito Caputo <vcaputo(a)pengaru.com>
Cc: kitsunyan <kitsunyan(a)airmail.cc>
Cc: Joonas Saarinen <jza(a)saunalahti.fi>
Tested-by: Vito Caputo <vcaputo(a)pengaru.com> # Thinkapd X61s
Tested-by: kitsunyan <kitsunyan(a)airmail.cc> # ThinkPad X200
Tested-by: Joonas Saarinen <jza(a)saunalahti.fi> # Fujitsu Siemens U9210
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105902
References: https://lists.freedesktop.org/archives/intel-gfx/2018-June/169315.html
References: https://bugs.freedesktop.org/show_bug.cgi?id=21230
Fixes: e2c8b8701e2d ("drm/i915: Use atomic helpers for suspend, v2.")
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.c | 10 ---
drivers/gpu/drm/i915/i915_drv.h | 2 -
drivers/gpu/drm/i915/intel_lvds.c | 136 +-------------------------------------
3 files changed, 2 insertions(+), 146 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 3834bd758a2e..f8cfd16be534 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -900,7 +900,6 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
spin_lock_init(&dev_priv->uncore.lock);
mutex_init(&dev_priv->sb_lock);
- mutex_init(&dev_priv->modeset_restore_lock);
mutex_init(&dev_priv->av_mutex);
mutex_init(&dev_priv->wm.wm_mutex);
mutex_init(&dev_priv->pps_mutex);
@@ -1570,11 +1569,6 @@ static int i915_drm_suspend(struct drm_device *dev)
struct pci_dev *pdev = dev_priv->drm.pdev;
pci_power_t opregion_target_state;
- /* ignore lid events during suspend */
- mutex_lock(&dev_priv->modeset_restore_lock);
- dev_priv->modeset_restore = MODESET_SUSPENDED;
- mutex_unlock(&dev_priv->modeset_restore_lock);
-
disable_rpm_wakeref_asserts(dev_priv);
/* We do a lot of poking in a lot of registers, make sure they work
@@ -1770,10 +1764,6 @@ static int i915_drm_resume(struct drm_device *dev)
intel_fbdev_set_suspend(dev, FBINFO_STATE_RUNNING, false);
- mutex_lock(&dev_priv->modeset_restore_lock);
- dev_priv->modeset_restore = MODESET_DONE;
- mutex_unlock(&dev_priv->modeset_restore_lock);
-
intel_opregion_notify_adapter(dev_priv, PCI_D0);
enable_rpm_wakeref_asserts(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4fb937399440..1b0af905b74c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1730,8 +1730,6 @@ struct drm_i915_private {
unsigned long quirks;
- enum modeset_restore modeset_restore;
- struct mutex modeset_restore_lock;
struct drm_atomic_state *modeset_restore_state;
struct drm_modeset_acquire_ctx reset_ctx;
diff --git a/drivers/gpu/drm/i915/intel_lvds.c b/drivers/gpu/drm/i915/intel_lvds.c
index ca55b0a82ba6..f9f3b0885ba5 100644
--- a/drivers/gpu/drm/i915/intel_lvds.c
+++ b/drivers/gpu/drm/i915/intel_lvds.c
@@ -44,8 +44,6 @@
/* Private structure for the integrated LVDS support */
struct intel_lvds_connector {
struct intel_connector base;
-
- struct notifier_block lid_notifier;
};
struct intel_lvds_pps {
@@ -452,26 +450,9 @@ static bool intel_lvds_compute_config(struct intel_encoder *intel_encoder,
return true;
}
-/*
- * Detect the LVDS connection.
- *
- * Since LVDS doesn't have hotlug, we use the lid as a proxy. Open means
- * connected and closed means disconnected. We also send hotplug events as
- * needed, using lid status notification from the input layer.
- */
static enum drm_connector_status
intel_lvds_detect(struct drm_connector *connector, bool force)
{
- struct drm_i915_private *dev_priv = to_i915(connector->dev);
- enum drm_connector_status status;
-
- DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n",
- connector->base.id, connector->name);
-
- status = intel_panel_detect(dev_priv);
- if (status != connector_status_unknown)
- return status;
-
return connector_status_connected;
}
@@ -496,117 +477,6 @@ static int intel_lvds_get_modes(struct drm_connector *connector)
return 1;
}
-static int intel_no_modeset_on_lid_dmi_callback(const struct dmi_system_id *id)
-{
- DRM_INFO("Skipping forced modeset for %s\n", id->ident);
- return 1;
-}
-
-/* The GPU hangs up on these systems if modeset is performed on LID open */
-static const struct dmi_system_id intel_no_modeset_on_lid[] = {
- {
- .callback = intel_no_modeset_on_lid_dmi_callback,
- .ident = "Toshiba Tecra A11",
- .matches = {
- DMI_MATCH(DMI_SYS_VENDOR, "TOSHIBA"),
- DMI_MATCH(DMI_PRODUCT_NAME, "TECRA A11"),
- },
- },
-
- { } /* terminating entry */
-};
-
-/*
- * Lid events. Note the use of 'modeset':
- * - we set it to MODESET_ON_LID_OPEN on lid close,
- * and set it to MODESET_DONE on open
- * - we use it as a "only once" bit (ie we ignore
- * duplicate events where it was already properly set)
- * - the suspend/resume paths will set it to
- * MODESET_SUSPENDED and ignore the lid open event,
- * because they restore the mode ("lid open").
- */
-static int intel_lid_notify(struct notifier_block *nb, unsigned long val,
- void *unused)
-{
- struct intel_lvds_connector *lvds_connector =
- container_of(nb, struct intel_lvds_connector, lid_notifier);
- struct drm_connector *connector = &lvds_connector->base.base;
- struct drm_device *dev = connector->dev;
- struct drm_i915_private *dev_priv = to_i915(dev);
-
- if (dev->switch_power_state != DRM_SWITCH_POWER_ON)
- return NOTIFY_OK;
-
- mutex_lock(&dev_priv->modeset_restore_lock);
- if (dev_priv->modeset_restore == MODESET_SUSPENDED)
- goto exit;
- /*
- * check and update the status of LVDS connector after receiving
- * the LID nofication event.
- */
- connector->status = connector->funcs->detect(connector, false);
-
- /* Don't force modeset on machines where it causes a GPU lockup */
- if (dmi_check_system(intel_no_modeset_on_lid))
- goto exit;
- if (!acpi_lid_open()) {
- /* do modeset on next lid open event */
- dev_priv->modeset_restore = MODESET_ON_LID_OPEN;
- goto exit;
- }
-
- if (dev_priv->modeset_restore == MODESET_DONE)
- goto exit;
-
- /*
- * Some old platform's BIOS love to wreak havoc while the lid is closed.
- * We try to detect this here and undo any damage. The split for PCH
- * platforms is rather conservative and a bit arbitrary expect that on
- * those platforms VGA disabling requires actual legacy VGA I/O access,
- * and as part of the cleanup in the hw state restore we also redisable
- * the vga plane.
- */
- if (!HAS_PCH_SPLIT(dev_priv))
- intel_display_resume(dev);
-
- dev_priv->modeset_restore = MODESET_DONE;
-
-exit:
- mutex_unlock(&dev_priv->modeset_restore_lock);
- return NOTIFY_OK;
-}
-
-static int
-intel_lvds_connector_register(struct drm_connector *connector)
-{
- struct intel_lvds_connector *lvds = to_lvds_connector(connector);
- int ret;
-
- ret = intel_connector_register(connector);
- if (ret)
- return ret;
-
- lvds->lid_notifier.notifier_call = intel_lid_notify;
- if (acpi_lid_notifier_register(&lvds->lid_notifier)) {
- DRM_DEBUG_KMS("lid notifier registration failed\n");
- lvds->lid_notifier.notifier_call = NULL;
- }
-
- return 0;
-}
-
-static void
-intel_lvds_connector_unregister(struct drm_connector *connector)
-{
- struct intel_lvds_connector *lvds = to_lvds_connector(connector);
-
- if (lvds->lid_notifier.notifier_call)
- acpi_lid_notifier_unregister(&lvds->lid_notifier);
-
- intel_connector_unregister(connector);
-}
-
/**
* intel_lvds_destroy - unregister and free LVDS structures
* @connector: connector to free
@@ -639,8 +509,8 @@ static const struct drm_connector_funcs intel_lvds_connector_funcs = {
.fill_modes = drm_helper_probe_single_connector_modes,
.atomic_get_property = intel_digital_connector_atomic_get_property,
.atomic_set_property = intel_digital_connector_atomic_set_property,
- .late_register = intel_lvds_connector_register,
- .early_unregister = intel_lvds_connector_unregister,
+ .late_register = intel_connector_register,
+ .early_unregister = intel_connector_unregister,
.destroy = intel_lvds_destroy,
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
.atomic_duplicate_state = intel_digital_connector_duplicate_state,
@@ -1114,8 +984,6 @@ void intel_lvds_init(struct drm_i915_private *dev_priv)
* 2) check for VBT data
* 3) check to see if LVDS is already on
* if none of the above, no panel
- * 4) make sure lid is open
- * if closed, act like it's not there for now
*/
/*
--
2.16.4
After cpu_stop_queue_two_works() queues the cpu_stop works
for the stopper threads, it releases the locks held for
both threads, which enables preemption, which allows the
following race condition to occur:
On one CPU, call it CPU 3, thread 1 invokes
cpu_stop_queue_two_works(2, 3,...), and the execution is such
that thread 1 queues the works for migration/2 and migration/3,
and is preempted after releasing the locks for migration/2 and
migration/3, but before waking the threads.
Then, On CPU 2, a kworker, call it thread 2, is running,
and it invokes cpu_stop_queue_two_works(1, 2,...), such that
thread 2 queues the works for migration/1 and migration/2.
Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
migration/2 and migration/3. This means that when CPU 2
releases the locks for migration/1 and migration/2, but before
it wakes those threads, it can be preempted by migration/2.
If thread 2 is preempted by migration/2, then migration/2 will
execute the first work item successfully, since migration/3
was woken up by CPU 3, but when it goes to execute the second
work item, it disables preemption, calls multi_cpu_stop(),
and thus, CPU 2 will wait forever for migration/1, which should
have been woken up by thread 2. However migration/1 cannot be
woken up by thread 2, since it is a kworker, so it is affine to
CPU 2, but CPU 2 is running migration/2 with preemption
disabled, so thread 2 will never run.
Disable preemption after queueing works for stopper threads
to ensure that the operation of queueing the works and waking
the stopper threads is atomic.
Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Co-Developed-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Co-Developed-by: Pavankumar Kondeti <pkondeti(a)codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm(a)codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti(a)codeaurora.org>
Cc: stable(a)vger.kernel.org
---
kernel/stop_machine.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a..e190d1e 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -260,6 +260,15 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
err = 0;
__cpu_stop_queue_work(stopper1, work1, &wakeq);
__cpu_stop_queue_work(stopper2, work2, &wakeq);
+ /*
+ * The waking up of stopper threads has to happen
+ * in the same scheduling context as the queueing.
+ * Otherwise, there is a possibility of one of the
+ * above stoppers being woken up by another CPU,
+ * and preempting us. This will cause us to n ot
+ * wake up the other stopper forever.
+ */
+ preempt_disable();
unlock:
raw_spin_unlock(&stopper2->lock);
raw_spin_unlock_irq(&stopper1->lock);
@@ -270,7 +279,10 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
goto retry;
}
- wake_up_q(&wakeq);
+ if (!err) {
+ wake_up_q(&wakeq);
+ preempt_enable();
+ }
return err;
}
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
From: Chris Wilson <chris(a)chris-wilson.co.uk>
This was supposed to be a mask of all known rings, but it is being used
by execbuffer to filter out invalid rings, and so is instead mapping high
unused values onto valid rings. Instead of a mask of all known rings,
we need it to be the mask of all possible rings.
Fixes: 549f7365820a ("drm/i915: Enable SandyBridge blitter ring")
Fixes: de1add360522 ("drm/i915: Decouple execbuf uAPI from internal implementation")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v4.6+
---
include/uapi/drm/i915_drm.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index eadefaa..51a754e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -966,7 +966,7 @@ struct drm_i915_gem_execbuffer2 {
* struct drm_i915_gem_exec_fence *fences.
*/
__u64 cliprects_ptr;
-#define I915_EXEC_RING_MASK (7<<0)
+#define I915_EXEC_RING_MASK (0x3f)
#define I915_EXEC_DEFAULT (0<<0)
#define I915_EXEC_RENDER (1<<0)
#define I915_EXEC_BSD (2<<0)
--
2.7.4
This is the start of the stable review cycle for the 4.9.113 release.
There are 32 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed Jul 18 07:34:43 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.113-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.113-rc1
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
loop: remember whether sysfs_create_group() was done
Leon Romanovsky <leonro(a)mellanox.com>
RDMA/ucm: Mark UCM interface as BROKEN
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
PM / hibernate: Fix oops at snapshot_write()
Theodore Ts'o <tytso(a)mit.edu>
loop: add recursion validation to LOOP_CHANGE_FD
Florian Westphal <fw(a)strlen.de>
netfilter: x_tables: initialise match/target check parameter struct
Eric Dumazet <edumazet(a)google.com>
netfilter: nf_queue: augment nfqa_cfg_policy
Oleg Nesterov <oleg(a)redhat.com>
uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
Keith Busch <keith.busch(a)intel.com>
nvme-pci: Remap CMB SQ entries on every controller reset
Steve Wise <swise(a)opengridcomputing.com>
iw_cxgb4: correctly enforce the max reg_mr depth
Jon Hunter <jonathanh(a)nvidia.com>
i2c: tegra: Fix NACK error handling
Paul Menzel <pmenzel(a)molgen.mpg.de>
tools build: fix # escaping in .cmd files for future Make
Oscar Salvador <osalvador(a)suse.de>
fs, elf: make sure to page align bss in load_elf_library
Chris Wilson <chris(a)chris-wilson.co.uk>
ALSA: hda - Handle pm failure during hotplug
Linus Torvalds <torvalds(a)linux-foundation.org>
Fix up non-directory creation in SGID directories
Tomasz Kramkowski <tk(a)the-tk.com>
HID: usbhid: add quirk for innomedia INNEX GENESIS/ATARI adapter
Dan Carpenter <dan.carpenter(a)oracle.com>
xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
Nico Sneck <snecknico(a)gmail.com>
usb: quirks: add delay quirks for Corsair Strafe
Johan Hovold <johan(a)kernel.org>
USB: serial: mos7840: fix status-register error handling
Jann Horn <jannh(a)google.com>
USB: yurex: fix out-of-bounds uaccess in read handler
Johan Hovold <johan(a)kernel.org>
USB: serial: keyspan_pda: fix modem-status error handling
Olli Salonen <olli.salonen(a)iki.fi>
USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
Dan Carpenter <dan.carpenter(a)oracle.com>
USB: serial: ch341: fix type promotion bug in ch341_control_in()
Hans de Goede <hdegoede(a)redhat.com>
ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
Nadav Amit <namit(a)vmware.com>
vmw_balloon: fix inflation with batching
Damien Le Moal <damien.lemoal(a)wdc.com>
ata: Fix ZBC_OUT all bit handling
Damien Le Moal <damien.lemoal(a)wdc.com>
ata: Fix ZBC_OUT command block check
Jann Horn <jannh(a)google.com>
ibmasm: don't write out of bounds in read handler
x00270170 <xiaqing17(a)hisilicon.com>
mmc: dw_mmc: fix card threshold control configuration
Paul Burton <paul.burton(a)mips.com>
MIPS: Fix ioremap() RAM check
Paul Burton <paul.burton(a)mips.com>
MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
Paul Burton <paul.burton(a)mips.com>
MIPS: Call dump_stack() from show_regs()
Scott Bauer <scott.bauer(a)intel.com>
nvme: validate admin queue before unquiesce
-------------
Diffstat:
Makefile | 4 +-
arch/mips/kernel/process.c | 43 ++++++++++++++-------
arch/mips/kernel/traps.c | 1 +
arch/mips/mm/ioremap.c | 37 ++++++++++++------
arch/x86/kernel/uprobes.c | 2 +-
drivers/ata/ahci.c | 59 +++++++++++++++++++++++++++++
drivers/ata/libata-core.c | 3 ++
drivers/ata/libata-scsi.c | 18 ++++++---
drivers/block/loop.c | 79 ++++++++++++++++++++++-----------------
drivers/block/loop.h | 1 +
drivers/hid/hid-ids.h | 3 ++
drivers/hid/usbhid/hid-quirks.c | 1 +
drivers/i2c/busses/i2c-tegra.c | 17 ++++-----
drivers/infiniband/Kconfig | 12 ++++++
drivers/infiniband/core/Makefile | 4 +-
drivers/infiniband/hw/cxgb4/mem.c | 2 +-
drivers/misc/ibmasm/ibmasmfs.c | 27 ++-----------
drivers/misc/vmw_balloon.c | 4 +-
drivers/mmc/host/dw_mmc.c | 7 ++--
drivers/nvme/host/core.c | 3 +-
drivers/nvme/host/pci.c | 27 +++++++------
drivers/usb/core/quirks.c | 4 ++
drivers/usb/host/xhci-mem.c | 2 +-
drivers/usb/misc/yurex.c | 23 +++---------
drivers/usb/serial/ch341.c | 2 +-
drivers/usb/serial/cp210x.c | 1 +
drivers/usb/serial/keyspan_pda.c | 4 +-
drivers/usb/serial/mos7840.c | 3 ++
fs/binfmt_elf.c | 5 +--
fs/inode.c | 6 +++
include/linux/libata.h | 1 +
kernel/power/user.c | 5 +++
net/bridge/netfilter/ebtables.c | 2 +
net/ipv4/netfilter/ip_tables.c | 1 +
net/ipv6/netfilter/ip6_tables.c | 1 +
net/netfilter/nfnetlink_queue.c | 3 ++
sound/pci/hda/patch_hdmi.c | 19 +++++++---
tools/build/Build.include | 4 +-
38 files changed, 287 insertions(+), 153 deletions(-)
Tree/Branch: v4.4.141
Git describe: v4.4.141
Commit: b3c6be58aa Linux 4.4.141
Build Time: 63 min 24 sec
Passed: 10 / 10 (100.00 %)
Failed: 0 / 10 ( 0.00 %)
Errors: 0
Warnings: 31
Section Mismatches: 0
-------------------------------------------------------------------------------
defconfigs with issues (other than build errors):
19 warnings 0 mismatches : arm64-allmodconfig
17 warnings 0 mismatches : x86_64-allmodconfig
-------------------------------------------------------------------------------
Warnings Summary: 31
3 warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
2 ../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
2 ../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
2 ../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2752 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2720 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4759:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2096 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 6784 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5840 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2304 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2104 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2552 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3248 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 3008 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 2992 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5296 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv0367.c:3147:1: warning: the frame size of 4144 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv0367.c:2490:1: warning: the frame size of 3424 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2984 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4328 bytes is larger than 2048 bytes [-Wframe-larger-than=]
===============================================================================
Detailed per-defconfig build reports below:
-------------------------------------------------------------------------------
arm64-allmodconfig : PASS, 0 errors, 19 warnings, 0 section mismatches
Warnings:
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 2992 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3248 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5840 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 6784 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv0367.c:2490:1: warning: the frame size of 3424 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2720 bytes is larger than 2048 bytes [-Wframe-larger-than=]
-------------------------------------------------------------------------------
x86_64-allmodconfig : PASS, 0 errors, 17 warnings, 0 section mismatches
Warnings:
../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 3008 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2104 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2304 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2096 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5296 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4759:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2552 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv0367.c:3147:1: warning: the frame size of 4144 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2984 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4328 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2752 bytes is larger than 2048 bytes [-Wframe-larger-than=]
-------------------------------------------------------------------------------
Passed with no errors, warnings or mismatches:
arm64-allnoconfig
arm-multi_v5_defconfig
arm-multi_v7_defconfig
x86_64-defconfig
arm-allmodconfig
arm-allnoconfig
x86_64-allnoconfig
arm64-defconfig
Modern MIPS cores (like P5600/6600, M5150/6520, end so on) which
got L2-cache on chip also can enable a special type Cache-Coherency
attribute (CCA) named UnCached Accelerated attribute (UCA). In this
way uncached accelerated accesses are treated the same way as
non-accelerated uncached accesses, but uncached stores are gathered
together for more efficient bus utilization. So to speak this CCA
enables uncached transactions to better utilize bus bandwidth via
burst transactions.
This is exactly why ioremap_wc() method has been introduced in linux.
Alas MIPS-platform code hasn't implemented it so far, instead default
one has been used which was an alias to ioremap_nocache. In order to
fix this we added MIPS-specific ioremap_wc() macro substituted by
generic __ioremap_mode() method call with writecombine CPU-info
field passed. It shall create real ioremap_wc() method if CPU-cache
supports UCA feature and fall-back to _CACHE_UNCACHED attribute
if one doesn't. Additionally platform-specific io.h shall declare
ARCH_HAS_IOREMAP_WC macro as indication of architectural definition
of ioremap_wc() (similar to x86/powerpc).
Signed-off-by: Serge Semin <fancer.lancer(a)gmail.com>
Singed-off-by: Paul Burton <paul.burton(a)mips.com>
Cc: James Hogan <jhogan(a)kernel.org>
Cc: Ralf Baechle <ralf(a)linux-mips.org>
Cc: linux-mips(a)linux-mips.org
Cc: stable(a)vger.kernel.org
---
arch/mips/include/asm/io.h | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/arch/mips/include/asm/io.h b/arch/mips/include/asm/io.h
index 4d709b61d..d4f8cdc58 100644
--- a/arch/mips/include/asm/io.h
+++ b/arch/mips/include/asm/io.h
@@ -12,6 +12,8 @@
#ifndef _ASM_IO_H
#define _ASM_IO_H
+#define ARCH_HAS_IOREMAP_WC
+
#include <linux/compiler.h>
#include <linux/kernel.h>
#include <linux/types.h>
@@ -278,6 +280,27 @@ static inline void __iomem * __ioremap_mode(phys_addr_t offset, unsigned long si
#define ioremap_cache ioremap_cachable
/*
+ * ioremap_wc - map bus memory into CPU space
+ * @offset: bus address of the memory
+ * @size: size of the resource to map
+ *
+ * ioremap_wc performs a platform specific sequence of operations to
+ * make bus memory CPU accessible via the readb/readw/readl/writeb/
+ * writew/writel functions and the other mmio helpers. The returned
+ * address is not guaranteed to be usable directly as a virtual
+ * address.
+ *
+ * This version of ioremap ensures that the memory is marked uncachable
+ * but accelerated by means of write-combining feature. It is specifically
+ * useful for PCIe prefetchable windows, which may vastly improve a
+ * communications performance. If it was determined on boot stage, what
+ * CPU CCA doesn't support UCA, the method shall fall-back to the
+ * _CACHE_UNCACHED option (see cpu_probe() method).
+ */
+#define ioremap_wc(offset, size) \
+ __ioremap_mode((offset), (size), boot_cpu_data.writecombine)
+
+/*
* These two are MIPS specific ioremap variant. ioremap_cacheable_cow
* requests a cachable mapping, ioremap_uncached_accelerated requests a
* mapping using the uncached accelerated mode which isn't supported on
--
2.12.0
When KVM emulates a physical timer, we keep track of the interrupt
condition and try to inject an IRQ to the guest when needed.
This works if the timer expires when either the guest is running or KVM
does work on behalf of it (like handling a trap).
However when the guest's VCPU is not scheduled (for instance because
the guest issued a WFI instruction before), we miss injecting the interrupt
when the VCPU's state gets restored back in kvm_timer_vcpu_load().
Fix this by moving the interrupt injection check into the
phys_timer_emulate() function, so that all possible paths of execution
are covered.
Cc: Stable <stable(a)vger.kernel.org> # 4.15+
Fixes: bbdd52cfcba29 ("KVM: arm/arm64: Avoid phys timer emulation in vcpu entry/exit")
Signed-off-by: Andre Przywara <andre.przywara(a)arm.com>
---
Changelog v1...v2:
- clear IRQ line *before* starting the soft timer
virt/kvm/arm/arch_timer.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index bd3d57f40f1b..03a4ea776b85 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -294,16 +294,25 @@ static void phys_timer_emulate(struct kvm_vcpu *vcpu)
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
+ /* If the timer cannot fire at all, then we don't need a soft timer. */
+ if (!kvm_timer_irq_can_fire(ptimer)) {
+ soft_timer_cancel(&timer->phys_timer, NULL);
+ kvm_timer_update_irq(vcpu, false, ptimer);
+ return;
+ }
+
/*
- * If the timer can fire now we have just raised the IRQ line and we
- * don't need to have a soft timer scheduled for the future. If the
- * timer cannot fire at all, then we also don't need a soft timer.
+ * If the timer can fire now, we don't need to have a soft timer
+ * scheduled for the future, as we also raise the IRQ line.
*/
- if (kvm_timer_should_fire(ptimer) || !kvm_timer_irq_can_fire(ptimer)) {
+ if (kvm_timer_should_fire(ptimer)) {
soft_timer_cancel(&timer->phys_timer, NULL);
+ kvm_timer_update_irq(vcpu, true, ptimer);
+
return;
}
+ kvm_timer_update_irq(vcpu, false, ptimer);
soft_timer_start(&timer->phys_timer, kvm_timer_compute_delta(ptimer));
}
@@ -316,7 +325,6 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
{
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
- struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
bool level;
if (unlikely(!timer->enabled))
@@ -332,9 +340,6 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
level = kvm_timer_should_fire(vtimer);
kvm_timer_update_irq(vcpu, level, vtimer);
- if (kvm_timer_should_fire(ptimer) != ptimer->irq.level)
- kvm_timer_update_irq(vcpu, !ptimer->irq.level, ptimer);
-
phys_timer_emulate(vcpu);
}
--
2.14.4
When KVM emulates a physical timer, we keep track of the interrupt
condition and try to inject an IRQ to the guest when needed.
This works if the timer expires when either the guest is running or KVM
does work on behalf of it, since it calls kvm_timer_update_state().
However when the guest's VCPU is not scheduled (for instance because
the guest issued a WFI instruction before), we miss injecting the interrupt
when the VCPU's state gets restored back in kvm_timer_vcpu_load().
Fix this by moving the interrupt injection check into the
phys_timer_emulate() function, so that all possible paths of execution
are covered.
This fixes the physical timer emulation, which broke when it got changed
in the 4.15 merge window.
The respective kvm-unit-test check has been posted already.
Cc: Stable <stable(a)vger.kernel.org> # 4.15+
Signed-off-by: Andre Przywara <andre.przywara(a)arm.com>
---
virt/kvm/arm/arch_timer.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index bd3d57f40f1b..1949fb0b80a4 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -294,17 +294,26 @@ static void phys_timer_emulate(struct kvm_vcpu *vcpu)
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
+ /* If the timer cannot fire at all, then we don't need a soft timer. */
+ if (!kvm_timer_irq_can_fire(ptimer)) {
+ soft_timer_cancel(&timer->phys_timer, NULL);
+ kvm_timer_update_irq(vcpu, false, ptimer);
+ return;
+ }
+
/*
- * If the timer can fire now we have just raised the IRQ line and we
- * don't need to have a soft timer scheduled for the future. If the
- * timer cannot fire at all, then we also don't need a soft timer.
+ * If the timer can fire now, we don't need to have a soft timer
+ * scheduled for the future, as we also raise the IRQ line.
*/
- if (kvm_timer_should_fire(ptimer) || !kvm_timer_irq_can_fire(ptimer)) {
+ if (kvm_timer_should_fire(ptimer)) {
soft_timer_cancel(&timer->phys_timer, NULL);
+ kvm_timer_update_irq(vcpu, true, ptimer);
+
return;
}
soft_timer_start(&timer->phys_timer, kvm_timer_compute_delta(ptimer));
+ kvm_timer_update_irq(vcpu, false, ptimer);
}
/*
@@ -316,7 +325,6 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
{
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
- struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
bool level;
if (unlikely(!timer->enabled))
@@ -332,9 +340,6 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
level = kvm_timer_should_fire(vtimer);
kvm_timer_update_irq(vcpu, level, vtimer);
- if (kvm_timer_should_fire(ptimer) != ptimer->irq.level)
- kvm_timer_update_irq(vcpu, !ptimer->irq.level, ptimer);
-
phys_timer_emulate(vcpu);
}
--
2.14.4
This fixes quite a number of runtime PM bugs I found that have been
causing some pretty nasty issues such as:
- Deadlocking on boot
- Connector probing potentially not working while the GPU is in runtime
suspend
- i2c char dev not working while the GPU is in runtime suspend
- aux char dev not working while the GPU is in runtime suspend
There's definitely more parts of nouveau that need to be fixed to use
runtime power management correctly, such as the hwmon portions, but this
series just handles the more important fixes that should get into stable
for the time being.
Cc: Karol Herbst <karolherbst(a)gmail.com>
Cc: stable(a)vger.kernel.org
Lyude Paul (5):
drm/nouveau: Prevent RPM callback recursion in suspend/resume paths
drm/nouveau: Grab RPM ref while probing outputs
drm/nouveau: Add missing RPM get/put() when probing connectors
drm/nouveau: Grab RPM ref when i2c bus is in use
drm/nouveau: Grab RPM ref when aux bus is in use
drivers/gpu/drm/nouveau/dispnv50/disp.c | 12 +++++++++--
drivers/gpu/drm/nouveau/nouveau_connector.c | 21 +++++++++++++++++--
drivers/gpu/drm/nouveau/nouveau_connector.h | 3 +++
drivers/gpu/drm/nouveau/nouveau_drm.c | 10 ++++++++-
drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c | 12 ++++++++++-
drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c | 12 ++++++++++-
6 files changed, 63 insertions(+), 7 deletions(-)
--
2.17.1
NOTE, this kernel release is broken for i386 systems. If you are
running such a machine, do NOT update to this release, you will not be
able to boot properly.
I did this release anyway with this known problem as there is a fix in
here for x86-64 systems that was nasty to track down and was affecting
people. Given that the huge majority of systems are NOT i386, I felt
this was a safe release to do at this point in time.
Once the proper fix for i386 systems has been accepted into Linus's tree
(it has been posted already), I will pick it up and do a new 4.17.y
release so that users of those systems can update.
All users of non-i386 systems running the 4.17 kernel series must upgrade.
The updated 4.17.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.17.y
and can be browsed at the normal kernel.org git web browser:
http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/kbuild/kbuild.txt | 9
Makefile | 2
arch/arm/boot/dts/armada-38x.dtsi | 2
arch/arm64/include/asm/simd.h | 19
arch/mips/kernel/process.c | 43 -
arch/mips/kernel/traps.c | 1
arch/mips/mm/ioremap.c | 37 -
arch/x86/crypto/Makefile | 4
arch/x86/crypto/salsa20-i586-asm_32.S | 938 --------------------------
arch/x86/crypto/salsa20-x86_64-asm_64.S | 805 ----------------------
arch/x86/crypto/salsa20_glue.c | 91 --
arch/x86/include/asm/vmx.h | 3
arch/x86/kernel/uprobes.c | 2
arch/x86/kvm/vmx.c | 67 +
arch/x86/kvm/x86.h | 9
arch/x86/purgatory/Makefile | 2
arch/x86/xen/enlighten_pv.c | 25
arch/x86/xen/irq.c | 4
block/bsg.c | 2
crypto/Kconfig | 28
crypto/sha3_generic.c | 2
drivers/acpi/acpica/hwsleep.c | 15
drivers/acpi/nfit/core.c | 44 -
drivers/acpi/nfit/nfit.h | 1
drivers/ata/ahci.c | 60 +
drivers/ata/libata-core.c | 3
drivers/ata/libata-scsi.c | 18
drivers/block/loop.c | 79 +-
drivers/block/loop.h | 1
drivers/gpu/drm/etnaviv/etnaviv_drv.c | 24
drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 3
drivers/gpu/drm/etnaviv/etnaviv_sched.c | 24
drivers/i2c/busses/i2c-tegra.c | 17
drivers/i2c/i2c-core-base.c | 11
drivers/infiniband/Kconfig | 11
drivers/infiniband/core/Makefile | 4
drivers/infiniband/hw/cxgb4/mem.c | 2
drivers/infiniband/hw/hfi1/rc.c | 2
drivers/infiniband/hw/hfi1/uc.c | 4
drivers/infiniband/hw/hfi1/ud.c | 4
drivers/infiniband/hw/hfi1/verbs_txreq.c | 4
drivers/infiniband/hw/hfi1/verbs_txreq.h | 4
drivers/misc/ibmasm/ibmasmfs.c | 27
drivers/misc/mei/interrupt.c | 5
drivers/misc/vmw_balloon.c | 4
drivers/mmc/host/dw_mmc.c | 7
drivers/mmc/host/renesas_sdhi_internal_dmac.c | 3
drivers/mmc/host/sdhci-esdhc-imx.c | 21
drivers/mtd/spi-nor/cadence-quadspi.c | 6
drivers/staging/rtl8723bs/core/rtw_ap.c | 2
drivers/staging/rtlwifi/rtl8822be/hw.c | 2
drivers/staging/rtlwifi/wifi.h | 1
drivers/thunderbolt/domain.c | 4
drivers/usb/core/quirks.c | 4
drivers/usb/host/xhci-mem.c | 2
drivers/usb/misc/yurex.c | 23
drivers/usb/serial/ch341.c | 2
drivers/usb/serial/cp210x.c | 1
drivers/usb/serial/keyspan_pda.c | 4
drivers/usb/serial/mos7840.c | 3
fs/binfmt_elf.c | 5
fs/f2fs/f2fs.h | 13
fs/f2fs/inode.c | 33
fs/f2fs/node.c | 21
fs/f2fs/segment.c | 25
fs/inode.c | 6
fs/proc/task_mmu.c | 3
fs/xfs/libxfs/xfs_ialloc_btree.c | 2
include/linux/libata.h | 1
kernel/bpf/verifier.c | 48 -
kernel/power/user.c | 5
kernel/trace/trace.c | 8
kernel/trace/trace_kprobe.c | 6
kernel/trace/trace_output.c | 5
mm/gup.c | 2
mm/mmap.c | 29
mm/page_alloc.c | 4
mm/rmap.c | 8
net/bridge/netfilter/ebtables.c | 2
net/ipv4/netfilter/ip_tables.c | 1
net/ipv6/netfilter/ip6_tables.c | 1
net/netfilter/nfnetlink_queue.c | 3
sound/pci/hda/patch_hdmi.c | 19
sound/pci/hda/patch_realtek.c | 6
tools/build/Build.include | 4
tools/testing/selftests/bpf/test_verifier.c | 58 +
86 files changed, 701 insertions(+), 2168 deletions(-)
Alexander Usyskin (1):
mei: discard messages from not connected client during power down.
Baruch Siach (1):
ARM: dts: armada-38x: use the new thermal binding
Chris Wilson (1):
ALSA: hda - Handle pm failure during hotplug
Christian Borntraeger (1):
mm: do not drop unused pages when userfaultd is running
Damien Le Moal (2):
ata: Fix ZBC_OUT command block check
ata: Fix ZBC_OUT all bit handling
Dan Carpenter (2):
USB: serial: ch341: fix type promotion bug in ch341_control_in()
xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
Dan Williams (1):
acpi, nfit: Fix scrub idle detection
Daniel Borkmann (1):
bpf: reject passing modified ctx to helper functions
Darrick J. Wong (1):
xfs: fix inobt magic number check
Dmitry Vyukov (1):
crypto: don't optimize keccakf()
Eric Biggers (1):
crypto: x86/salsa20 - remove x86 salsa20 implementations
Eric Dumazet (1):
netfilter: nf_queue: augment nfqa_cfg_policy
Fabio Estevam (2):
drm/etnaviv: Check for platform_device_register_simple() failure
drm/etnaviv: Fix driver unregistering
Florian Westphal (1):
netfilter: x_tables: initialise match/target check parameter struct
Greg Kroah-Hartman (1):
Linux 4.17.7
Hans de Goede (1):
ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
Hui Wang (1):
ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION
Jaegeuk Kim (4):
f2fs: give message and set need_fsck given broken node id
f2fs: avoid bug_on on corrupted inode
f2fs: sanity check on sit entry
f2fs: sanity check for total valid node blocks
Jann Horn (2):
ibmasm: don't write out of bounds in read handler
USB: yurex: fix out-of-bounds uaccess in read handler
Jiri Olsa (1):
tracing/kprobe: Release kprobe print_fmt properly
Joel Fernandes (Google) (1):
tracing: Reorder display of TGID to be after PID
Johan Hovold (2):
USB: serial: keyspan_pda: fix modem-status error handling
USB: serial: mos7840: fix status-register error handling
Jon Hunter (1):
i2c: tegra: Fix NACK error handling
Juergen Gross (2):
xen: remove global bit from __default_kernel_pte_mask for pv guests
xen: setup pv irq ops vector earlier
Leon Romanovsky (1):
RDMA/ucm: Mark UCM interface as BROKEN
Linus Torvalds (1):
Fix up non-directory creation in SGID directories
Lucas Stach (1):
drm/etnaviv: bring back progress check in job timeout handler
Marc Orr (1):
kvm: vmx: Nested VM-entry prereqs for event inj.
Michael J. Ruhl (1):
IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values
Michal Hocko (1):
mm: do not bug_on on incorrect length in __mm_populate()
Mika Westerberg (2):
ahci: Add Intel Ice Lake LP PCI ID
thunderbolt: Notify userspace when boot_acl is changed
Murray McAllister (1):
staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data().
Nadav Amit (1):
vmw_balloon: fix inflation with batching
Nico Sneck (1):
usb: quirks: add delay quirks for Corsair Strafe
Oleg Nesterov (1):
uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
Olli Salonen (1):
USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
Oscar Salvador (1):
fs, elf: make sure to page align bss in load_elf_library
Paul Burton (3):
MIPS: Call dump_stack() from show_regs()
MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
MIPS: Fix ioremap() RAM check
Paul Menzel (1):
tools build: fix # escaping in .cmd files for future Make
Pavel Tatashin (1):
mm: zero unavailable pages before memmap init
Philipp Rudo (1):
x86/purgatory: add missing FORCE to Makefile target
Ping-Ke Shih (1):
staging: r8822be: Fix RTL8822be can't find any wireless AP
Rafael J. Wysocki (1):
ACPICA: Clear status of all events when entering S5
Randy Dunlap (1):
kbuild: delete INSTALL_FW_PATH from kbuild documentation
Stefan Agner (1):
mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states
Steve Wise (1):
iw_cxgb4: correctly enforce the max reg_mr depth
Tetsuo Handa (2):
PM / hibernate: Fix oops at snapshot_write()
loop: remember whether sysfs_create_group() was done
Theodore Ts'o (1):
loop: add recursion validation to LOOP_CHANGE_FD
Tony Battersby (1):
bsg: fix bogus EINVAL on non-data commands
Vignesh R (1):
mtd: spi-nor: cadence-quadspi: Fix direct mode write timeouts
Vlastimil Babka (1):
fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
Wolfram Sang (1):
i2c: recovery: if possible send STOP with recovery pulses
Yandong Zhao (1):
arm64: neon: Fix function may_use_simd() return error status
Yoshihiro Shimoda (1):
mmc: renesas_sdhi_internal_dmac: Cannot clear the RX_IN_USE in abort
x00270170 (1):
mmc: dw_mmc: fix card threshold control configuration
This is the start of the stable review cycle for the 4.17.7 release.
There are 67 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed Jul 18 07:34:11 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.17.7-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.17.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.17.7-rc1
Baruch Siach <baruch(a)tkos.co.il>
ARM: dts: armada-38x: use the new thermal binding
Jaegeuk Kim <jaegeuk(a)kernel.org>
f2fs: sanity check for total valid node blocks
Jaegeuk Kim <jaegeuk(a)kernel.org>
f2fs: sanity check on sit entry
Jaegeuk Kim <jaegeuk(a)kernel.org>
f2fs: avoid bug_on on corrupted inode
Jaegeuk Kim <jaegeuk(a)kernel.org>
f2fs: give message and set need_fsck given broken node id
Marc Orr <marcorr(a)google.com>
kvm: vmx: Nested VM-entry prereqs for event inj.
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
loop: remember whether sysfs_create_group() was done
Leon Romanovsky <leon(a)kernel.org>
RDMA/ucm: Mark UCM interface as BROKEN
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
PM / hibernate: Fix oops at snapshot_write()
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: fix inobt magic number check
Theodore Ts'o <tytso(a)mit.edu>
loop: add recursion validation to LOOP_CHANGE_FD
Florian Westphal <fw(a)strlen.de>
netfilter: x_tables: initialise match/target check parameter struct
Dmitry Vyukov <dvyukov(a)google.com>
crypto: don't optimize keccakf()
Eric Dumazet <edumazet(a)google.com>
netfilter: nf_queue: augment nfqa_cfg_policy
Oleg Nesterov <oleg(a)redhat.com>
uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
Eric Biggers <ebiggers(a)google.com>
crypto: x86/salsa20 - remove x86 salsa20 implementations
Tony Battersby <tonyb(a)cybernetics.com>
bsg: fix bogus EINVAL on non-data commands
Juergen Gross <jgross(a)suse.com>
xen: setup pv irq ops vector earlier
Juergen Gross <jgross(a)suse.com>
xen: remove global bit from __default_kernel_pte_mask for pv guests
Steve Wise <swise(a)opengridcomputing.com>
iw_cxgb4: correctly enforce the max reg_mr depth
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
i2c: recovery: if possible send STOP with recovery pulses
Jon Hunter <jonathanh(a)nvidia.com>
i2c: tegra: Fix NACK error handling
Michael J. Ruhl <michael.j.ruhl(a)intel.com>
IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values
Paul Menzel <pmenzel(a)molgen.mpg.de>
tools build: fix # escaping in .cmd files for future Make
Yandong Zhao <yandong77520(a)gmail.com>
arm64: neon: Fix function may_use_simd() return error status
Dan Williams <dan.j.williams(a)intel.com>
acpi, nfit: Fix scrub idle detection
Randy Dunlap <rdunlap(a)infradead.org>
kbuild: delete INSTALL_FW_PATH from kbuild documentation
Joel Fernandes (Google) <joel(a)joelfernandes.org>
tracing: Reorder display of TGID to be after PID
Michal Hocko <mhocko(a)suse.com>
mm: do not bug_on on incorrect length in __mm_populate()
Oscar Salvador <osalvador(a)suse.de>
fs, elf: make sure to page align bss in load_elf_library
Philipp Rudo <prudo(a)linux.ibm.com>
x86/purgatory: add missing FORCE to Makefile target
Vlastimil Babka <vbabka(a)suse.cz>
fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
Christian Borntraeger <borntraeger(a)de.ibm.com>
mm: do not drop unused pages when userfaultd is running
Chris Wilson <chris(a)chris-wilson.co.uk>
ALSA: hda - Handle pm failure during hotplug
Hui Wang <hui.wang(a)canonical.com>
ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION
Pavel Tatashin <pasha.tatashin(a)oracle.com>
mm: zero unavailable pages before memmap init
Linus Torvalds <torvalds(a)linux-foundation.org>
Fix up non-directory creation in SGID directories
Dan Carpenter <dan.carpenter(a)oracle.com>
xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
Nico Sneck <snecknico(a)gmail.com>
usb: quirks: add delay quirks for Corsair Strafe
Johan Hovold <johan(a)kernel.org>
USB: serial: mos7840: fix status-register error handling
Jann Horn <jannh(a)google.com>
USB: yurex: fix out-of-bounds uaccess in read handler
Johan Hovold <johan(a)kernel.org>
USB: serial: keyspan_pda: fix modem-status error handling
Olli Salonen <olli.salonen(a)iki.fi>
USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
Dan Carpenter <dan.carpenter(a)oracle.com>
USB: serial: ch341: fix type promotion bug in ch341_control_in()
Mika Westerberg <mika.westerberg(a)linux.intel.com>
thunderbolt: Notify userspace when boot_acl is changed
Hans de Goede <hdegoede(a)redhat.com>
ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
Mika Westerberg <mika.westerberg(a)linux.intel.com>
ahci: Add Intel Ice Lake LP PCI ID
Nadav Amit <namit(a)vmware.com>
vmw_balloon: fix inflation with batching
Jiri Olsa <jolsa(a)kernel.org>
tracing/kprobe: Release kprobe print_fmt properly
Vignesh R <vigneshr(a)ti.com>
mtd: spi-nor: cadence-quadspi: Fix direct mode write timeouts
Alexander Usyskin <alexander.usyskin(a)intel.com>
mei: discard messages from not connected client during power down.
Damien Le Moal <damien.lemoal(a)wdc.com>
ata: Fix ZBC_OUT all bit handling
Damien Le Moal <damien.lemoal(a)wdc.com>
ata: Fix ZBC_OUT command block check
Ping-Ke Shih <pkshih(a)realtek.com>
staging: r8822be: Fix RTL8822be can't find any wireless AP
Murray McAllister <murray.mcallister(a)insomniasec.com>
staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data().
Jann Horn <jannh(a)google.com>
ibmasm: don't write out of bounds in read handler
Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
mmc: renesas_sdhi_internal_dmac: Cannot clear the RX_IN_USE in abort
x00270170 <xiaqing17(a)hisilicon.com>
mmc: dw_mmc: fix card threshold control configuration
Stefan Agner <stefan(a)agner.ch>
mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
ACPICA: Clear status of all events when entering S5
Lucas Stach <l.stach(a)pengutronix.de>
drm/etnaviv: bring back progress check in job timeout handler
Fabio Estevam <fabio.estevam(a)nxp.com>
drm/etnaviv: Fix driver unregistering
Fabio Estevam <fabio.estevam(a)nxp.com>
drm/etnaviv: Check for platform_device_register_simple() failure
Paul Burton <paul.burton(a)mips.com>
MIPS: Fix ioremap() RAM check
Paul Burton <paul.burton(a)mips.com>
MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
Paul Burton <paul.burton(a)mips.com>
MIPS: Call dump_stack() from show_regs()
Daniel Borkmann <daniel(a)iogearbox.net>
bpf: reject passing modified ctx to helper functions
-------------
Diffstat:
Documentation/kbuild/kbuild.txt | 9 -
Makefile | 4 +-
arch/arm/boot/dts/armada-38x.dtsi | 2 +-
arch/arm64/include/asm/simd.h | 19 +-
arch/mips/kernel/process.c | 43 +-
arch/mips/kernel/traps.c | 1 +
arch/mips/mm/ioremap.c | 37 +-
arch/x86/crypto/Makefile | 4 -
arch/x86/crypto/salsa20-i586-asm_32.S | 938 --------------------------
arch/x86/crypto/salsa20-x86_64-asm_64.S | 805 ----------------------
arch/x86/crypto/salsa20_glue.c | 91 ---
arch/x86/include/asm/vmx.h | 3 +
arch/x86/kernel/uprobes.c | 2 +-
arch/x86/kvm/vmx.c | 67 ++
arch/x86/kvm/x86.h | 9 +
arch/x86/purgatory/Makefile | 2 +-
arch/x86/xen/enlighten_pv.c | 25 +-
arch/x86/xen/irq.c | 4 +-
block/bsg.c | 2 -
crypto/Kconfig | 28 -
crypto/sha3_generic.c | 2 +-
drivers/acpi/acpica/hwsleep.c | 15 +-
drivers/acpi/nfit/core.c | 44 +-
drivers/acpi/nfit/nfit.h | 1 +
drivers/ata/ahci.c | 60 ++
drivers/ata/libata-core.c | 3 +
drivers/ata/libata-scsi.c | 18 +-
drivers/block/loop.c | 79 ++-
drivers/block/loop.h | 1 +
drivers/gpu/drm/etnaviv/etnaviv_drv.c | 24 +-
drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 3 +
drivers/gpu/drm/etnaviv/etnaviv_sched.c | 24 +
drivers/i2c/busses/i2c-tegra.c | 17 +-
drivers/i2c/i2c-core-base.c | 11 +-
drivers/infiniband/Kconfig | 11 +
drivers/infiniband/core/Makefile | 4 +-
drivers/infiniband/hw/cxgb4/mem.c | 2 +-
drivers/infiniband/hw/hfi1/rc.c | 2 +-
drivers/infiniband/hw/hfi1/uc.c | 4 +-
drivers/infiniband/hw/hfi1/ud.c | 4 +-
drivers/infiniband/hw/hfi1/verbs_txreq.c | 4 +-
drivers/infiniband/hw/hfi1/verbs_txreq.h | 4 +-
drivers/misc/ibmasm/ibmasmfs.c | 27 +-
drivers/misc/mei/interrupt.c | 5 +-
drivers/misc/vmw_balloon.c | 4 +-
drivers/mmc/host/dw_mmc.c | 7 +-
drivers/mmc/host/renesas_sdhi_internal_dmac.c | 3 +-
drivers/mmc/host/sdhci-esdhc-imx.c | 21 +-
drivers/mtd/spi-nor/cadence-quadspi.c | 6 +-
drivers/staging/rtl8723bs/core/rtw_ap.c | 2 +-
drivers/staging/rtlwifi/rtl8822be/hw.c | 2 +-
drivers/staging/rtlwifi/wifi.h | 1 +
drivers/thunderbolt/domain.c | 4 +
drivers/usb/core/quirks.c | 4 +
drivers/usb/host/xhci-mem.c | 2 +-
drivers/usb/misc/yurex.c | 23 +-
drivers/usb/serial/ch341.c | 2 +-
drivers/usb/serial/cp210x.c | 1 +
drivers/usb/serial/keyspan_pda.c | 4 +-
drivers/usb/serial/mos7840.c | 3 +
fs/binfmt_elf.c | 5 +-
fs/f2fs/f2fs.h | 13 +-
fs/f2fs/inode.c | 33 +-
fs/f2fs/node.c | 21 +-
fs/f2fs/segment.c | 25 +
fs/inode.c | 6 +
fs/proc/task_mmu.c | 3 +-
fs/xfs/libxfs/xfs_ialloc_btree.c | 2 +-
include/linux/libata.h | 1 +
kernel/bpf/verifier.c | 48 +-
kernel/power/user.c | 5 +
kernel/trace/trace.c | 8 +-
kernel/trace/trace_kprobe.c | 6 +-
kernel/trace/trace_output.c | 5 +-
mm/gup.c | 2 -
mm/mmap.c | 29 +-
mm/page_alloc.c | 4 +-
mm/rmap.c | 8 +-
net/bridge/netfilter/ebtables.c | 2 +
net/ipv4/netfilter/ip_tables.c | 1 +
net/ipv6/netfilter/ip6_tables.c | 1 +
net/netfilter/nfnetlink_queue.c | 3 +
sound/pci/hda/patch_hdmi.c | 19 +-
sound/pci/hda/patch_realtek.c | 6 +-
tools/build/Build.include | 4 +-
tools/testing/selftests/bpf/test_verifier.c | 58 +-
86 files changed, 702 insertions(+), 2169 deletions(-)
We have got a team of professional to do image editing service for you.
We have 20 image editors and on daily basis 2000 images can be processed.
If you want to check our quality of work please send us a photo with
instruction and we will work on it.
Our Services:
Photo cut out, masking, clipping path
Color, brightness and contrast correction
Beauty, Model retouching, skin retouching
Image cropping and resizing
Correcting the shape and size
We do unlimited revisions until you are satisfied with the work.
Thanks,
Ruby Young
This is a note to let you know that I've just added the patch titled
usb: dwc2: Fix DMA alignment to start at allocated boundary
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 56406e017a883b54b339207b230f85599f4d70ae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Antti=20Sepp=C3=A4l=C3=A4?= <a.seppala(a)gmail.com>
Date: Thu, 5 Jul 2018 17:31:53 +0300
Subject: usb: dwc2: Fix DMA alignment to start at allocated boundary
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The commit 3bc04e28a030 ("usb: dwc2: host: Get aligned DMA in a more
supported way") introduced a common way to align DMA allocations.
The code in the commit aligns the struct dma_aligned_buffer but the
actual DMA address pointed by data[0] gets aligned to an offset from
the allocated boundary by the kmalloc_ptr and the old_xfer_buffer
pointers.
This is against the recommendation in Documentation/DMA-API.txt which
states:
Therefore, it is recommended that driver writers who don't take
special care to determine the cache line size at run time only map
virtual regions that begin and end on page boundaries (which are
guaranteed also to be cache line boundaries).
The effect of this is that architectures with non-coherent DMA caches
may run into memory corruption or kernel crashes with Unhandled
kernel unaligned accesses exceptions.
Fix the alignment by positioning the DMA area in front of the allocation
and use memory at the end of the area for storing the orginal
transfer_buffer pointer. This may have the added benefit of increased
performance as the DMA area is now fully aligned on all architectures.
Tested with Lantiq xRX200 (MIPS) and RPi Model B Rev 2 (ARM).
Fixes: 3bc04e28a030 ("usb: dwc2: host: Get aligned DMA in a more supported way")
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Douglas Anderson <dianders(a)chromium.org>
Signed-off-by: Antti Seppälä <a.seppala(a)gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi(a)linux.intel.com>
---
drivers/usb/dwc2/hcd.c | 44 ++++++++++++++++++++++--------------------
1 file changed, 23 insertions(+), 21 deletions(-)
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index b1104be3429c..2ed0ac18e053 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -2665,34 +2665,29 @@ static int dwc2_alloc_split_dma_aligned_buf(struct dwc2_hsotg *hsotg,
#define DWC2_USB_DMA_ALIGN 4
-struct dma_aligned_buffer {
- void *kmalloc_ptr;
- void *old_xfer_buffer;
- u8 data[0];
-};
-
static void dwc2_free_dma_aligned_buffer(struct urb *urb)
{
- struct dma_aligned_buffer *temp;
+ void *stored_xfer_buffer;
if (!(urb->transfer_flags & URB_ALIGNED_TEMP_BUFFER))
return;
- temp = container_of(urb->transfer_buffer,
- struct dma_aligned_buffer, data);
+ /* Restore urb->transfer_buffer from the end of the allocated area */
+ memcpy(&stored_xfer_buffer, urb->transfer_buffer +
+ urb->transfer_buffer_length, sizeof(urb->transfer_buffer));
if (usb_urb_dir_in(urb))
- memcpy(temp->old_xfer_buffer, temp->data,
+ memcpy(stored_xfer_buffer, urb->transfer_buffer,
urb->transfer_buffer_length);
- urb->transfer_buffer = temp->old_xfer_buffer;
- kfree(temp->kmalloc_ptr);
+ kfree(urb->transfer_buffer);
+ urb->transfer_buffer = stored_xfer_buffer;
urb->transfer_flags &= ~URB_ALIGNED_TEMP_BUFFER;
}
static int dwc2_alloc_dma_aligned_buffer(struct urb *urb, gfp_t mem_flags)
{
- struct dma_aligned_buffer *temp, *kmalloc_ptr;
+ void *kmalloc_ptr;
size_t kmalloc_size;
if (urb->num_sgs || urb->sg ||
@@ -2700,22 +2695,29 @@ static int dwc2_alloc_dma_aligned_buffer(struct urb *urb, gfp_t mem_flags)
!((uintptr_t)urb->transfer_buffer & (DWC2_USB_DMA_ALIGN - 1)))
return 0;
- /* Allocate a buffer with enough padding for alignment */
+ /*
+ * Allocate a buffer with enough padding for original transfer_buffer
+ * pointer. This allocation is guaranteed to be aligned properly for
+ * DMA
+ */
kmalloc_size = urb->transfer_buffer_length +
- sizeof(struct dma_aligned_buffer) + DWC2_USB_DMA_ALIGN - 1;
+ sizeof(urb->transfer_buffer);
kmalloc_ptr = kmalloc(kmalloc_size, mem_flags);
if (!kmalloc_ptr)
return -ENOMEM;
- /* Position our struct dma_aligned_buffer such that data is aligned */
- temp = PTR_ALIGN(kmalloc_ptr + 1, DWC2_USB_DMA_ALIGN) - 1;
- temp->kmalloc_ptr = kmalloc_ptr;
- temp->old_xfer_buffer = urb->transfer_buffer;
+ /*
+ * Position value of original urb->transfer_buffer pointer to the end
+ * of allocation for later referencing
+ */
+ memcpy(kmalloc_ptr + urb->transfer_buffer_length,
+ &urb->transfer_buffer, sizeof(urb->transfer_buffer));
+
if (usb_urb_dir_out(urb))
- memcpy(temp->data, urb->transfer_buffer,
+ memcpy(kmalloc_ptr, urb->transfer_buffer,
urb->transfer_buffer_length);
- urb->transfer_buffer = temp->data;
+ urb->transfer_buffer = kmalloc_ptr;
urb->transfer_flags |= URB_ALIGNED_TEMP_BUFFER;
--
2.18.0
This is a note to let you know that I've just added the patch titled
usb: gadget: Fix OS descriptors support
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 50b9773c13bffbef32060e67c4483ea7b2eca7b5 Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Date: Wed, 27 Jun 2018 12:33:56 +1000
Subject: usb: gadget: Fix OS descriptors support
The current code is broken as it re-defines "req" inside the
if block, then goto out of it. Thus the request that ends
up being sent is not the one that was populated by the
code in question.
This fixes RNDIS driver autodetect by Windows 10 for me.
The bug was introduced by Chris rework to remove the local
queuing inside the if { } block of the redefined request.
Fixes: 636ba13aec8a ("usb: gadget: composite: remove duplicated code in OS desc handling")
Cc: <stable(a)vger.kernel.org> # v4.17
Signed-off-by: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Signed-off-by: Felipe Balbi <felipe.balbi(a)linux.intel.com>
---
drivers/usb/gadget/composite.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
index d2fa071c21b1..b8a15840b4ff 100644
--- a/drivers/usb/gadget/composite.c
+++ b/drivers/usb/gadget/composite.c
@@ -1819,7 +1819,6 @@ composite_setup(struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl)
if (cdev->use_os_string && cdev->os_desc_config &&
(ctrl->bRequestType & USB_TYPE_VENDOR) &&
ctrl->bRequest == cdev->b_vendor_code) {
- struct usb_request *req;
struct usb_configuration *os_desc_cfg;
u8 *buf;
int interface;
--
2.18.0
From: Paul Burton <paul.burton(a)mips.com>
The current MIPS implementation of arch_trigger_cpumask_backtrace() is
broken because it attempts to use synchronous IPIs despite the fact that
it may be run with interrupts disabled.
This means that when arch_trigger_cpumask_backtrace() is invoked, for
example by the RCU CPU stall watchdog, we may:
- Deadlock due to use of synchronous IPIs with interrupts disabled,
causing the CPU that's attempting to generate the backtrace output
to hang itself.
- Not succeed in generating the desired output from remote CPUs.
- Produce warnings about this from smp_call_function_many(), for
example:
[42760.526910] INFO: rcu_sched detected stalls on CPUs/tasks:
[42760.535755] 0-...!: (1 GPs behind) idle=ade/140000000000000/0 softirq=526944/526945 fqs=0
[42760.547874] 1-...!: (0 ticks this GP) idle=e4a/140000000000000/0 softirq=547885/547885 fqs=0
[42760.559869] (detected by 2, t=2162 jiffies, g=266689, c=266688, q=33)
[42760.568927] ------------[ cut here ]------------
[42760.576146] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:416 smp_call_function_many+0x88/0x20c
[42760.587839] Modules linked in:
[42760.593152] CPU: 2 PID: 1216 Comm: sh Not tainted 4.15.4-00373-gee058bb4d0c2 #2
[42760.603767] Stack : 8e09bd20 8e09bd20 8e09bd20 fffffff0 00000007 00000006 00000000 8e09bca8
[42760.616937] 95b2b379 95b2b379 807a0080 00000007 81944518 0000018a 00000032 00000000
[42760.630095] 00000000 00000030 80000000 00000000 806eca74 00000009 8017e2b8 000001a0
[42760.643169] 00000000 00000002 00000000 8e09baa4 00000008 808b8008 86d69080 8e09bca0
[42760.656282] 8e09ad50 805e20aa 00000000 00000000 00000000 8017e2b8 00000009 801070ca
[42760.669424] ...
[42760.673919] Call Trace:
[42760.678672] [<27fde568>] show_stack+0x70/0xf0
[42760.685417] [<84751641>] dump_stack+0xaa/0xd0
[42760.692188] [<699d671c>] __warn+0x80/0x92
[42760.698549] [<68915d41>] warn_slowpath_null+0x28/0x36
[42760.705912] [<f7c76c1c>] smp_call_function_many+0x88/0x20c
[42760.713696] [<6bbdfc2a>] arch_trigger_cpumask_backtrace+0x30/0x4a
[42760.722216] [<f845bd33>] rcu_dump_cpu_stacks+0x6a/0x98
[42760.729580] [<796e7629>] rcu_check_callbacks+0x672/0x6ac
[42760.737476] [<059b3b43>] update_process_times+0x18/0x34
[42760.744981] [<6eb94941>] tick_sched_handle.isra.5+0x26/0x38
[42760.752793] [<478d3d70>] tick_sched_timer+0x1c/0x50
[42760.759882] [<e56ea39f>] __hrtimer_run_queues+0xc6/0x226
[42760.767418] [<e88bbcae>] hrtimer_interrupt+0x88/0x19a
[42760.775031] [<6765a19e>] gic_compare_interrupt+0x2e/0x3a
[42760.782761] [<0558bf5f>] handle_percpu_devid_irq+0x78/0x168
[42760.790795] [<90c11ba2>] generic_handle_irq+0x1e/0x2c
[42760.798117] [<1b6d462c>] gic_handle_local_int+0x38/0x86
[42760.805545] [<b2ada1c7>] gic_irq_dispatch+0xa/0x14
[42760.812534] [<90c11ba2>] generic_handle_irq+0x1e/0x2c
[42760.820086] [<c7521934>] do_IRQ+0x16/0x20
[42760.826274] [<9aef3ce6>] plat_irq_dispatch+0x62/0x94
[42760.833458] [<6a94b53c>] except_vec_vi_end+0x70/0x78
[42760.840655] [<22284043>] smp_call_function_many+0x1ba/0x20c
[42760.848501] [<54022b58>] smp_call_function+0x1e/0x2c
[42760.855693] [<ab9fc705>] flush_tlb_mm+0x2a/0x98
[42760.862730] [<0844cdd0>] tlb_flush_mmu+0x1c/0x44
[42760.869628] [<cb259b74>] arch_tlb_finish_mmu+0x26/0x3e
[42760.877021] [<1aeaaf74>] tlb_finish_mmu+0x18/0x66
[42760.883907] [<b3fce717>] exit_mmap+0x76/0xea
[42760.890428] [<c4c8a2f6>] mmput+0x80/0x11a
[42760.896632] [<a41a08f4>] do_exit+0x1f4/0x80c
[42760.903158] [<ee01cef6>] do_group_exit+0x20/0x7e
[42760.909990] [<13fa8d54>] __wake_up_parent+0x0/0x1e
[42760.917045] [<46cf89d0>] smp_call_function_many+0x1a2/0x20c
[42760.924893] [<8c21a93b>] syscall_common+0x14/0x1c
[42760.931765] ---[ end trace 02aa09da9dc52a60 ]---
[42760.938342] ------------[ cut here ]------------
[42760.945311] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:291 smp_call_function_single+0xee/0xf8
...
This patch switches MIPS' arch_trigger_cpumask_backtrace() to use async
IPIs & smp_call_function_single_async() in order to resolve this
problem. We ensure use of the pre-allocated call_single_data_t
structures is serialized by maintaining a cpumask indicating that
they're busy, and refusing to attempt to send an IPI when a CPU's bit is
set in this mask. This should only happen if a CPU hasn't responded to a
previous backtrace IPI - ie. if it's hung - and we print a warning to
the console in this case.
I've marked this for stable branches as far back as v4.9, to which it
applies cleanly. Strictly speaking the faulty MIPS implementation can be
traced further back to commit 856839b76836 ("MIPS: Add
arch_trigger_all_cpu_backtrace() function") in v3.19, but kernel
versions v3.19 through v4.8 will require further work to backport due to
the rework performed in commit 9a01c3ed5cdb ("nmi_backtrace: add more
trigger_*_cpu_backtrace() methods").
Signed-off-by: Paul Burton <paul.burton(a)mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19597/
Cc: James Hogan <jhogan(a)kernel.org>
Cc: Ralf Baechle <ralf(a)linux-mips.org>
Cc: linux-mips(a)linux-mips.org
Cc: stable(a)vger.kernel.org
Fixes: 856839b76836 ("MIPS: Add arch_trigger_all_cpu_backtrace() function")
Fixes: 9a01c3ed5cdb ("nmi_backtrace: add more trigger_*_cpu_backtrace() methods")
[ Huacai: backported to 4.9: Replace "call_single_data_t" with "struct call_single_data" ]
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
arch/mips/kernel/process.c | 45 ++++++++++++++++++++++++++++++---------------
1 file changed, 30 insertions(+), 15 deletions(-)
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index cb1e9c1..513a63b 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -26,6 +26,7 @@
#include <linux/kallsyms.h>
#include <linux/random.h>
#include <linux/prctl.h>
+#include <linux/nmi.h>
#include <asm/asm.h>
#include <asm/bootinfo.h>
@@ -633,28 +634,42 @@ unsigned long arch_align_stack(unsigned long sp)
return sp & ALMASK;
}
-static void arch_dump_stack(void *info)
-{
- struct pt_regs *regs;
+static DEFINE_PER_CPU(struct call_single_data, backtrace_csd);
+static struct cpumask backtrace_csd_busy;
- regs = get_irq_regs();
-
- if (regs)
- show_regs(regs);
- else
- dump_stack();
+static void handle_backtrace(void *info)
+{
+ nmi_cpu_backtrace(get_irq_regs());
+ cpumask_clear_cpu(smp_processor_id(), &backtrace_csd_busy);
}
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+static void raise_backtrace(cpumask_t *mask)
{
- long this_cpu = get_cpu();
+ struct call_single_data *csd;
+ int cpu;
- if (cpumask_test_cpu(this_cpu, mask) && !exclude_self)
- dump_stack();
+ for_each_cpu(cpu, mask) {
+ /*
+ * If we previously sent an IPI to the target CPU & it hasn't
+ * cleared its bit in the busy cpumask then it didn't handle
+ * our previous IPI & it's not safe for us to reuse the
+ * call_single_data_t.
+ */
+ if (cpumask_test_and_set_cpu(cpu, &backtrace_csd_busy)) {
+ pr_warn("Unable to send backtrace IPI to CPU%u - perhaps it hung?\n",
+ cpu);
+ continue;
+ }
- smp_call_function_many(mask, arch_dump_stack, NULL, 1);
+ csd = &per_cpu(backtrace_csd, cpu);
+ csd->func = handle_backtrace;
+ smp_call_function_single_async(cpu, csd);
+ }
+}
- put_cpu();
+void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+{
+ nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace);
}
int mips_get_process_fp_mode(struct task_struct *task)
--
2.7.0
This is the start of the stable review cycle for the 4.14.56 release.
There are 54 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed Jul 18 07:34:24 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.56-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.56-rc1
Jaegeuk Kim <jaegeuk(a)kernel.org>
f2fs: give message and set need_fsck given broken node id
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
loop: remember whether sysfs_create_group() was done
Leon Romanovsky <leonro(a)mellanox.com>
RDMA/ucm: Mark UCM interface as BROKEN
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
PM / hibernate: Fix oops at snapshot_write()
Theodore Ts'o <tytso(a)mit.edu>
loop: add recursion validation to LOOP_CHANGE_FD
Florian Westphal <fw(a)strlen.de>
netfilter: x_tables: initialise match/target check parameter struct
Eric Dumazet <edumazet(a)google.com>
netfilter: nf_queue: augment nfqa_cfg_policy
Oleg Nesterov <oleg(a)redhat.com>
uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
Eric Biggers <ebiggers(a)google.com>
crypto: x86/salsa20 - remove x86 salsa20 implementations
Keith Busch <keith.busch(a)intel.com>
nvme-pci: Remap CMB SQ entries on every controller reset
Juergen Gross <jgross(a)suse.com>
xen: setup pv irq ops vector earlier
Steve Wise <swise(a)opengridcomputing.com>
iw_cxgb4: correctly enforce the max reg_mr depth
Jon Hunter <jonathanh(a)nvidia.com>
i2c: tegra: Fix NACK error handling
Michael J. Ruhl <michael.j.ruhl(a)intel.com>
IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values
Paul Menzel <pmenzel(a)molgen.mpg.de>
tools build: fix # escaping in .cmd files for future Make
Yandong Zhao <yandong77520(a)gmail.com>
arm64: neon: Fix function may_use_simd() return error status
Randy Dunlap <rdunlap(a)infradead.org>
kbuild: delete INSTALL_FW_PATH from kbuild documentation
Joel Fernandes (Google) <joel(a)joelfernandes.org>
tracing: Reorder display of TGID to be after PID
Michal Hocko <mhocko(a)suse.com>
mm: do not bug_on on incorrect length in __mm_populate()
Oscar Salvador <osalvador(a)suse.de>
fs, elf: make sure to page align bss in load_elf_library
Vlastimil Babka <vbabka(a)suse.cz>
fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
Christian Borntraeger <borntraeger(a)de.ibm.com>
mm: do not drop unused pages when userfaultd is running
Chris Wilson <chris(a)chris-wilson.co.uk>
ALSA: hda - Handle pm failure during hotplug
Hui Wang <hui.wang(a)canonical.com>
ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION
Ming Lei <ming.lei(a)redhat.com>
scsi: megaraid_sas: fix selection of reply queue
Shivasharan S <shivasharan.srikanteshwara(a)broadcom.com>
scsi: megaraid_sas: Create separate functions to allocate ctrl memory
Shivasharan S <shivasharan.srikanteshwara(a)broadcom.com>
scsi: megaraid_sas: replace is_ventura with adapter_type checks
Shivasharan S <shivasharan.srikanteshwara(a)broadcom.com>
scsi: megaraid_sas: replace instance->ctrl_context checks with instance->adapter_type
Shivasharan S <shivasharan.srikanteshwara(a)broadcom.com>
scsi: megaraid_sas: use adapter_type for all gen controllers
Christoph Hellwig <hch(a)lst.de>
genirq/affinity: assign vectors to all possible CPUs
Linus Torvalds <torvalds(a)linux-foundation.org>
Fix up non-directory creation in SGID directories
Christian Brauner <christian.brauner(a)ubuntu.com>
devpts: resolve devpts bind-mounts
Christian Brauner <christian.brauner(a)ubuntu.com>
devpts: hoist out check for DEVPTS_SUPER_MAGIC
Dan Carpenter <dan.carpenter(a)oracle.com>
xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
Nico Sneck <snecknico(a)gmail.com>
usb: quirks: add delay quirks for Corsair Strafe
Johan Hovold <johan(a)kernel.org>
USB: serial: mos7840: fix status-register error handling
Jann Horn <jannh(a)google.com>
USB: yurex: fix out-of-bounds uaccess in read handler
Johan Hovold <johan(a)kernel.org>
USB: serial: keyspan_pda: fix modem-status error handling
Olli Salonen <olli.salonen(a)iki.fi>
USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
Dan Carpenter <dan.carpenter(a)oracle.com>
USB: serial: ch341: fix type promotion bug in ch341_control_in()
Hans de Goede <hdegoede(a)redhat.com>
ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
Nadav Amit <namit(a)vmware.com>
vmw_balloon: fix inflation with batching
Damien Le Moal <damien.lemoal(a)wdc.com>
ata: Fix ZBC_OUT all bit handling
Damien Le Moal <damien.lemoal(a)wdc.com>
ata: Fix ZBC_OUT command block check
Ping-Ke Shih <pkshih(a)realtek.com>
staging: r8822be: Fix RTL8822be can't find any wireless AP
Murray McAllister <murray.mcallister(a)insomniasec.com>
staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data().
Jann Horn <jannh(a)google.com>
ibmasm: don't write out of bounds in read handler
x00270170 <xiaqing17(a)hisilicon.com>
mmc: dw_mmc: fix card threshold control configuration
Stefan Agner <stefan(a)agner.ch>
mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states
Paul Burton <paul.burton(a)mips.com>
MIPS: Fix ioremap() RAM check
Paul Burton <paul.burton(a)mips.com>
MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
Paul Burton <paul.burton(a)mips.com>
MIPS: Call dump_stack() from show_regs()
Kai Chieh Chuang <kaichieh.chuang(a)mediatek.com>
ASoC: mediatek: preallocate pages use platform device
Sean Young <sean(a)mess.org>
media: rc: mce_kbd decoder: fix stuck keys
-------------
Diffstat:
Documentation/kbuild/kbuild.txt | 9 -
Makefile | 4 +-
arch/arm64/include/asm/simd.h | 19 +-
arch/mips/kernel/process.c | 43 +-
arch/mips/kernel/traps.c | 1 +
arch/mips/mm/ioremap.c | 37 +-
arch/x86/crypto/Makefile | 4 -
arch/x86/crypto/salsa20-i586-asm_32.S | 1114 --------------------
arch/x86/crypto/salsa20-x86_64-asm_64.S | 919 ----------------
arch/x86/crypto/salsa20_glue.c | 116 --
arch/x86/kernel/uprobes.c | 2 +-
arch/x86/xen/enlighten_pv.c | 24 +-
arch/x86/xen/irq.c | 4 +-
crypto/Kconfig | 26 -
drivers/ata/ahci.c | 59 ++
drivers/ata/libata-core.c | 3 +
drivers/ata/libata-scsi.c | 18 +-
drivers/block/loop.c | 79 +-
drivers/block/loop.h | 1 +
drivers/i2c/busses/i2c-tegra.c | 17 +-
drivers/infiniband/Kconfig | 12 +
drivers/infiniband/core/Makefile | 4 +-
drivers/infiniband/hw/cxgb4/mem.c | 2 +-
drivers/infiniband/hw/hfi1/rc.c | 2 +-
drivers/infiniband/hw/hfi1/uc.c | 4 +-
drivers/infiniband/hw/hfi1/ud.c | 4 +-
drivers/infiniband/hw/hfi1/verbs_txreq.c | 4 +-
drivers/infiniband/hw/hfi1/verbs_txreq.h | 4 +-
drivers/media/rc/ir-mce_kbd-decoder.c | 2 +
drivers/misc/ibmasm/ibmasmfs.c | 27 +-
drivers/misc/vmw_balloon.c | 4 +-
drivers/mmc/host/dw_mmc.c | 7 +-
drivers/mmc/host/sdhci-esdhc-imx.c | 21 +-
drivers/nvme/host/pci.c | 27 +-
drivers/scsi/megaraid/megaraid_sas.h | 10 +-
drivers/scsi/megaraid/megaraid_sas_base.c | 296 ++++--
drivers/scsi/megaraid/megaraid_sas_fp.c | 20 +-
drivers/scsi/megaraid/megaraid_sas_fusion.c | 54 +-
drivers/scsi/megaraid/megaraid_sas_fusion.h | 7 -
drivers/staging/rtl8723bs/core/rtw_ap.c | 2 +-
drivers/staging/rtlwifi/rtl8822be/hw.c | 2 +-
drivers/staging/rtlwifi/wifi.h | 1 +
drivers/usb/core/quirks.c | 4 +
drivers/usb/host/xhci-mem.c | 2 +-
drivers/usb/misc/yurex.c | 23 +-
drivers/usb/serial/ch341.c | 2 +-
drivers/usb/serial/cp210x.c | 1 +
drivers/usb/serial/keyspan_pda.c | 4 +-
drivers/usb/serial/mos7840.c | 3 +
fs/binfmt_elf.c | 5 +-
fs/devpts/inode.c | 48 +-
fs/f2fs/f2fs.h | 13 +-
fs/f2fs/inode.c | 13 +-
fs/f2fs/node.c | 21 +-
fs/inode.c | 6 +
fs/proc/task_mmu.c | 2 +-
include/linux/libata.h | 1 +
kernel/irq/affinity.c | 30 +-
kernel/power/user.c | 5 +
kernel/trace/trace.c | 8 +-
kernel/trace/trace_output.c | 5 +-
mm/gup.c | 2 -
mm/mmap.c | 29 +-
mm/rmap.c | 8 +-
net/bridge/netfilter/ebtables.c | 2 +
net/ipv4/netfilter/ip_tables.c | 1 +
net/ipv6/netfilter/ip6_tables.c | 1 +
net/netfilter/nfnetlink_queue.c | 3 +
sound/pci/hda/patch_hdmi.c | 19 +-
sound/pci/hda/patch_realtek.c | 6 +-
.../soc/mediatek/common/mtk-afe-platform-driver.c | 4 +-
tools/build/Build.include | 4 +-
72 files changed, 665 insertions(+), 2625 deletions(-)
SEV guest fails to update the UEFI runtime variables stored in the
flash. commit 1379edd59673 ("x86/efi: Access EFI data as encrypted
when SEV is active") unconditionally maps all the UEFI runtime data
as 'encrypted' (C=1). When SEV is active the UEFI runtime data marked
as EFI_MEMORY_MAPPED_IO should be mapped as 'unencrypted' so that both
guest and hypervisor can access the data.
Fixes: 1379edd59673 (x86/efi: Access EFI data as encrypted ...)
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: linux-efi(a)vger.kernel.org
Cc: kvm(a)vger.kernel.org
Cc: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Cc: Matt Fleming <matt(a)codeblueprint.co.uk>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: <stable(a)vger.kernel.org> # 4.15.x
Signed-off-by: Brijesh Singh <brijesh.singh(a)amd.com>
---
arch/x86/platform/efi/efi_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 77873ce..5f2eb32 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -417,7 +417,7 @@ static void __init __map_region(efi_memory_desc_t *md, u64 va)
if (!(md->attribute & EFI_MEMORY_WB))
flags |= _PAGE_PCD;
- if (sev_active())
+ if (sev_active() && md->type != EFI_MEMORY_MAPPED_IO)
flags |= _PAGE_ENC;
pfn = md->phys_addr >> PAGE_SHIFT;
--
2.7.4
The patch titled
Subject: mm: do not bug_on on incorrect length in __mm_populate()
has been removed from the -mm tree. Its filename was
mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Michal Hocko <mhocko(a)suse.com>
Subject: mm: do not bug_on on incorrect length in __mm_populate()
syzbot has noticed that a specially crafted library can easily hit
VM_BUG_ON in __mm_populate
localhost login: [ 81.210241] emacs (9634) used greatest stack depth: 10416 bytes left
[ 140.099935] ------------[ cut here ]------------
[ 140.101904] kernel BUG at mm/gup.c:1242!
[ 140.103572] invalid opcode: 0000 [#1] SMP
[ 140.105220] CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
[ 140.107762] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[ 140.112000] RIP: 0010:__mm_populate+0x1e2/0x1f0
[ 140.113875] Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
[ 140.121403] RSP: 0018:ffffc90000dffd78 EFLAGS: 00010293
[ 140.123516] RAX: ffff8801366c63c0 RBX: 000000007bf81000 RCX: ffffffff813e4ee2
[ 140.126352] RDX: 0000000000000000 RSI: 0000000000007676 RDI: 000000007bf81000
[ 140.129236] RBP: ffffc90000dffdc0 R08: 0000000000000000 R09: 0000000000000000
[ 140.132110] R10: ffff880135895c80 R11: 0000000000000000 R12: 0000000000007676
[ 140.134955] R13: 0000000000008000 R14: 0000000000000000 R15: 0000000000007676
[ 140.137785] FS: 0000000000000000(0000) GS:ffff88013a680000(0063) knlGS:00000000f7db9700
[ 140.140998] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 140.143303] CR2: 00000000f7ea56e0 CR3: 0000000134674004 CR4: 00000000000606e0
[ 140.145906] Call Trace:
[ 140.146728] vm_brk_flags+0xc3/0x100
[ 140.147830] vm_brk+0x1f/0x30
[ 140.148714] load_elf_library+0x281/0x2e0
[ 140.149875] __ia32_sys_uselib+0x170/0x1e0
[ 140.151028] ? copy_overflow+0x30/0x30
[ 140.152105] ? __ia32_sys_uselib+0x170/0x1e0
[ 140.153301] do_fast_syscall_32+0xca/0x420
[ 140.154455] entry_SYSENTER_compat+0x70/0x7f
The reason is that the length of the new brk is not page aligned when we
try to populate the it. There is no reason to bug on that though.
do_brk_flags already aligns the length properly so the mapping is expanded
as it should. All we need is to tell mm_populate about it. Besides that
there is absolutely no reason to to bug_on in the first place. The worst
thing that could happen is that the last page wouldn't get populated and
that is far from putting system into an inconsistent state.
Fix the issue by moving the length sanitization code from do_brk_flags up
to vm_brk_flags. The only other caller of do_brk_flags is brk syscall
entry and it makes sure to provide the proper length so t here is no need
for sanitation and so we can use do_brk_flags without it.
Also remove the bogus BUG_ONs.
[osalvador(a)techadventures.net: fix up vm_brk_flags s@request@len@]
Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Reported-by: syzbot <syzbot+5dcb560fe12aa5091c06(a)syzkaller.appspotmail.com>
Tested-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: Zi Yan <zi.yan(a)cs.rutgers.edu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Michael S. Tsirkin <mst(a)redhat.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 2 --
mm/mmap.c | 29 ++++++++++++-----------------
2 files changed, 12 insertions(+), 19 deletions(-)
diff -puN mm/gup.c~mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate mm/gup.c
--- a/mm/gup.c~mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate
+++ a/mm/gup.c
@@ -1238,8 +1238,6 @@ int __mm_populate(unsigned long start, u
int locked = 0;
long ret = 0;
- VM_BUG_ON(start & ~PAGE_MASK);
- VM_BUG_ON(len != PAGE_ALIGN(len));
end = start + len;
for (nstart = start; nstart < end; nstart = nend) {
diff -puN mm/mmap.c~mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate mm/mmap.c
--- a/mm/mmap.c~mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate
+++ a/mm/mmap.c
@@ -186,8 +186,8 @@ static struct vm_area_struct *remove_vma
return next;
}
-static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf);
-
+static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags,
+ struct list_head *uf);
SYSCALL_DEFINE1(brk, unsigned long, brk)
{
unsigned long retval;
@@ -245,7 +245,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
goto out;
/* Ok, looks good - let it rip. */
- if (do_brk(oldbrk, newbrk-oldbrk, &uf) < 0)
+ if (do_brk_flags(oldbrk, newbrk-oldbrk, 0, &uf) < 0)
goto out;
set_brk:
@@ -2929,21 +2929,14 @@ static inline void verify_mm_writelocked
* anonymous maps. eventually we may be able to do some
* brk-specific accounting here.
*/
-static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags, struct list_head *uf)
+static int do_brk_flags(unsigned long addr, unsigned long len, unsigned long flags, struct list_head *uf)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma, *prev;
- unsigned long len;
struct rb_node **rb_link, *rb_parent;
pgoff_t pgoff = addr >> PAGE_SHIFT;
int error;
- len = PAGE_ALIGN(request);
- if (len < request)
- return -ENOMEM;
- if (!len)
- return 0;
-
/* Until we need other flags, refuse anything except VM_EXEC. */
if ((flags & (~VM_EXEC)) != 0)
return -EINVAL;
@@ -3015,18 +3008,20 @@ out:
return 0;
}
-static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf)
-{
- return do_brk_flags(addr, len, 0, uf);
-}
-
-int vm_brk_flags(unsigned long addr, unsigned long len, unsigned long flags)
+int vm_brk_flags(unsigned long addr, unsigned long request, unsigned long flags)
{
struct mm_struct *mm = current->mm;
+ unsigned long len;
int ret;
bool populate;
LIST_HEAD(uf);
+ len = PAGE_ALIGN(request);
+ if (len < request)
+ return -ENOMEM;
+ if (!len)
+ return 0;
+
if (down_write_killable(&mm->mmap_sem))
return -EINTR;
_
Patches currently in -mm which might be from mhocko(a)suse.com are
mm-drop-vm_bug_on-from-__get_free_pages.patch
memcg-oom-move-out_of_memory-back-to-the-charge-path.patch
mm-oom-remove-sleep-from-under-oom_lock.patch
mm-oom-document-oom_lock.patch
mm-oom-docs-describe-the-cgroup-aware-oom-killer-fix-2.patch
The patch titled
Subject: fs, elf: make sure to page align bss in load_elf_library
has been removed from the -mm tree. Its filename was
fs-elf-make-sure-to-page-align-bss-in-load_elf_library.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Oscar Salvador <osalvador(a)suse.de>
Subject: fs, elf: make sure to page align bss in load_elf_library
The current code does not make sure to page align bss before calling
vm_brk(), and this can lead to a VM_BUG_ON() in __mm_populate()
due to the requested lenght not being correctly aligned.
Let us make sure to align it properly.
Kees: only applicable to CONFIG_USELIB kernels: 32-bit and configured for
libc5.
Link: http://lkml.kernel.org/r/20180705145539.9627-1-osalvador@techadventures.net
Signed-off-by: Oscar Salvador <osalvador(a)suse.de>
Reported-by: syzbot+5dcb560fe12aa5091c06(a)syzkaller.appspotmail.com
Tested-by: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nicolas Pitre <nicolas.pitre(a)linaro.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/binfmt_elf.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff -puN fs/binfmt_elf.c~fs-elf-make-sure-to-page-align-bss-in-load_elf_library fs/binfmt_elf.c
--- a/fs/binfmt_elf.c~fs-elf-make-sure-to-page-align-bss-in-load_elf_library
+++ a/fs/binfmt_elf.c
@@ -1259,9 +1259,8 @@ static int load_elf_library(struct file
goto out_free_ph;
}
- len = ELF_PAGESTART(eppnt->p_filesz + eppnt->p_vaddr +
- ELF_MIN_ALIGN - 1);
- bss = eppnt->p_memsz + eppnt->p_vaddr;
+ len = ELF_PAGEALIGN(eppnt->p_filesz + eppnt->p_vaddr);
+ bss = ELF_PAGEALIGN(eppnt->p_memsz + eppnt->p_vaddr);
if (bss > len) {
error = vm_brk(len, bss - len);
if (error)
_
Patches currently in -mm which might be from osalvador(a)suse.de are
mm-memory_hotplug-make-add_memory_resource-use-__try_online_node.patch
mm-memory_hotplug-call-register_mem_sect_under_node.patch
mm-memory_hotplug-make-register_mem_sect_under_node-a-cb-of-walk_memory_range.patch
mm-memory_hotplug-drop-unnecessary-checks-from-register_mem_sect_under_node.patch
mm-sparse-make-sparse_init_one_section-void-and-remove-check.patch
The patch titled
Subject: x86/purgatory: add missing FORCE to Makefile target
has been removed from the -mm tree. Its filename was
x86-purgatory-add-missing-force-to-makefile-target.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Philipp Rudo <prudo(a)linux.ibm.com>
Subject: x86/purgatory: add missing FORCE to Makefile target
- Build the kernel without the fix
- Add some flag to the purgatories KBUILD_CFLAGS,I used
-fno-asynchronous-unwind-tables
- Re-build the kernel
When you look at makes output you see that sha256.o is not re-build in the
last step. Also readelf -S still shows the .eh_frame section for
sha256.o.
With the fix sha256.o is rebuilt in the last step.
Without FORCE make does not detect changes only made to the command line
options. So object files might not be re-built even when they should be.
Fix this by adding FORCE where it is missing.
Link: http://lkml.kernel.org/r/20180704110044.29279-2-prudo@linux.ibm.com
Fixes: df6f2801f511 ("kernel/kexec_file.c: move purgatories sha256 to common code")
Signed-off-by: Philipp Rudo <prudo(a)linux.ibm.com>
Acked-by: Dave Young <dyoung(a)redhat.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: <stable(a)vger.kernel.org> [4.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/purgatory/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN arch/x86/purgatory/Makefile~x86-purgatory-add-missing-force-to-makefile-target arch/x86/purgatory/Makefile
--- a/arch/x86/purgatory/Makefile~x86-purgatory-add-missing-force-to-makefile-target
+++ a/arch/x86/purgatory/Makefile
@@ -6,7 +6,7 @@ purgatory-y := purgatory.o stack.o setup
targets += $(purgatory-y)
PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
-$(obj)/sha256.o: $(srctree)/lib/sha256.c
+$(obj)/sha256.o: $(srctree)/lib/sha256.c FORCE
$(call if_changed_rule,cc_o_c)
LDFLAGS_purgatory.ro := -e purgatory_start -r --no-undefined -nostdlib -z nodefaultlib
_
Patches currently in -mm which might be from prudo(a)linux.ibm.com are
The patch titled
Subject: fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
has been removed from the -mm tree. Its filename was
mm-fix-locked-field-in-proc-pid-smaps.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
Thomas reports:
: While looking around in /proc on my v4.14.52 system I noticed that
: all processes got a lot of "Locked" memory in /proc/*/smaps. A lot
: more memory than a regular user can usually lock with mlock().
:
: commit 493b0e9d945fa9dfe96be93ae41b4ca4b6fdb317 (v4.14-rc1) seems
: to have changed the behavior of "Locked".
:
: Before that commit the code was like this. Notice the VM_LOCKED
: check.
:
: (vma->vm_flags & VM_LOCKED) ?
: (unsigned long)(mss.pss >> (10 + PSS_SHIFT)) : 0);
:
: After that commit Locked is now the same as Pss. This looks like a
: mistake.
:
: (unsigned long)(mss->pss >> (10 + PSS_SHIFT)));
Indeed, the commit has added mss->pss_locked with the correct value that
depends on VM_LOCKED, but forgot to actually use it. Fix it.
Link: http://lkml.kernel.org/r/ebf6c7fb-fec3-6a26-544f-710ed193c154@suse.cz
Fixes: 493b0e9d945f ("mm: add /proc/pid/smaps_rollup")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Thomas Lindroth <thomas.lindroth(a)gmail.com>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: Daniel Colascione <dancol(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff -puN fs/proc/task_mmu.c~mm-fix-locked-field-in-proc-pid-smaps fs/proc/task_mmu.c
--- a/fs/proc/task_mmu.c~mm-fix-locked-field-in-proc-pid-smaps
+++ a/fs/proc/task_mmu.c
@@ -831,7 +831,8 @@ static int show_smap(struct seq_file *m,
SEQ_PUT_DEC(" kB\nSwap: ", mss->swap);
SEQ_PUT_DEC(" kB\nSwapPss: ",
mss->swap_pss >> PSS_SHIFT);
- SEQ_PUT_DEC(" kB\nLocked: ", mss->pss >> PSS_SHIFT);
+ SEQ_PUT_DEC(" kB\nLocked: ",
+ mss->pss_locked >> PSS_SHIFT);
seq_puts(m, " kB\n");
}
if (!rollup_mode) {
_
Patches currently in -mm which might be from vbabka(a)suse.cz are
mm-page_alloc-actually-ignore-mempolicies-for-high-priority-allocations.patch
The patch titled
Subject: mm: do not drop unused pages when userfaultd is running
has been removed from the -mm tree. Its filename was
mm-do-not-drop-unused-pages-when-userfaultd-is-running.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Christian Borntraeger <borntraeger(a)de.ibm.com>
Subject: mm: do not drop unused pages when userfaultd is running
KVM guests on s390 can notify the host of unused pages. This can result
in pte_unused callbacks to be true for KVM guest memory.
If a page is unused (checked with pte_unused) we might drop this page
instead of paging it. This can have side-effects on userfaultd, when the
page in question was already migrated:
The next access of that page will trigger a fault and a user fault instead
of faulting in a new and empty zero page. As QEMU does not expect a
userfault on an already migrated page this migration will fail.
The most straightforward solution is to ignore the pte_unused hint if a
userfault context is active for this VMA.
Link: http://lkml.kernel.org/r/20180703171854.63981-1-borntraeger@de.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Mike Rapoport <rppt(a)linux.vnet.ibm.com>
Cc: Janosch Frank <frankja(a)linux.ibm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Cornelia Huck <cohuck(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/rmap.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff -puN mm/rmap.c~mm-do-not-drop-unused-pages-when-userfaultd-is-running mm/rmap.c
--- a/mm/rmap.c~mm-do-not-drop-unused-pages-when-userfaultd-is-running
+++ a/mm/rmap.c
@@ -64,6 +64,7 @@
#include <linux/backing-dev.h>
#include <linux/page_idle.h>
#include <linux/memremap.h>
+#include <linux/userfaultfd_k.h>
#include <asm/tlbflush.h>
@@ -1481,11 +1482,16 @@ static bool try_to_unmap_one(struct page
set_pte_at(mm, address, pvmw.pte, pteval);
}
- } else if (pte_unused(pteval)) {
+ } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) {
/*
* The guest indicated that the page content is of no
* interest anymore. Simply discard the pte, vmscan
* will take care of the rest.
+ * A future reference will then fault in a new zero
+ * page. When userfaultfd is active, we must not drop
+ * this page though, as its main user (postcopy
+ * migration) will not expect userfaults on already
+ * copied pages.
*/
dec_mm_counter(mm, mm_counter(page));
/* We have to invalidate as we cleared the pte */
_
Patches currently in -mm which might be from borntraeger(a)de.ibm.com are
With commit eca0fa28cd0d ("perf record: Provide detailed information on s390 CPU")
s390 platform provides detailed type/model/capacitiy information
in the CPU indentifier string instead of just "IBM/S390".
This breaks perf kvm support which uses hard coded string IBM/S390 to
compare with the CPU identifier string. Fix this by changing the comparison.
Fixes: eca0fa28cd0d ("perf record: Provide detailed information on s390 CPU")
Cc: Stefan Raspl <raspl(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org> # 4.17
Signed-off-by: Thomas Richter <tmricht(a)linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner(a)linux.ibm.com>
---
tools/perf/arch/s390/util/kvm-stat.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/arch/s390/util/kvm-stat.c b/tools/perf/arch/s390/util/kvm-stat.c
index d233e2eb9592..aaabab5e2830 100644
--- a/tools/perf/arch/s390/util/kvm-stat.c
+++ b/tools/perf/arch/s390/util/kvm-stat.c
@@ -102,7 +102,7 @@ const char * const kvm_skip_events[] = {
int cpu_isa_init(struct perf_kvm_stat *kvm, const char *cpuid)
{
- if (strstr(cpuid, "IBM/S390")) {
+ if (strstr(cpuid, "IBM")) {
kvm->exit_reasons = sie_exit_reasons;
kvm->exit_reasons_isa = "SIE";
} else
--
2.14.3
From: David Woodhouse <dwmw(a)amazon.co.uk>
commit def9331a12977770cc6132d79f8e6565871e8e38 upstream
When running as Xen pv guest X86_BUG_SYSRET_SS_ATTRS must not be set
on AMD cpus.
This bug/feature bit is kind of special as it will be used very early
when switching threads. Setting the bit and clearing it a little bit
later leaves a critical window where things can go wrong. This time
window has enlarged a little bit by using setup_clear_cpu_cap() instead
of the hypervisor's set_cpu_features callback. It seems this larger
window now makes it rather easy to hit the problem.
The proper solution is to never set the bit in case of Xen.
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa(a)csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley(a)gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov(a)vmware.com>
Reviewed-by: Bo Gan <ganb(a)vmware.com>
---
arch/x86/kernel/cpu/amd.c | 5 +++--
arch/x86/xen/enlighten.c | 4 +---
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f4fb8f5..9b29414 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -791,8 +791,9 @@ static void init_amd(struct cpuinfo_x86 *c)
if (cpu_has(c, X86_FEATURE_3DNOW) || cpu_has(c, X86_FEATURE_LM))
set_cpu_cap(c, X86_FEATURE_3DNOWPREFETCH);
- /* AMD CPUs don't reset SS attributes on SYSRET */
- set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS);
+ /* AMD CPUs don't reset SS attributes on SYSRET, Xen does. */
+ if (!cpu_has(c, X86_FEATURE_XENPV))
+ set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS);
}
#ifdef CONFIG_X86_32
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 2d7ab4e..82fd84d 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -462,10 +462,8 @@ static void __init xen_init_cpuid_mask(void)
static void __init xen_init_capabilities(void)
{
- if (xen_pv_domain()) {
- setup_clear_cpu_cap(X86_BUG_SYSRET_SS_ATTRS);
+ if (xen_pv_domain())
setup_force_cpu_cap(X86_FEATURE_XENPV);
- }
}
static void xen_set_debugreg(int reg, unsigned long val)
On 15/07/18 01:32, kernelci.org bot wrote:
> mainline/master boot: 177 boots: 2 failed, 174 passed with 1 conflict (v4.18-rc4-160-gf353078f028f)
>
> Full Boot Summary: https://kernelci.org/boot/all/job/mainline/branch/master/kernel/v4.18-rc4-1…
> Full Build Summary: https://kernelci.org/build/mainline/branch/master/kernel/v4.18-rc4-160-gf35…
>
> Tree: mainline
> Branch: master
> Git Describe: v4.18-rc4-160-gf353078f028f
> Git Commit: f353078f028fbfe9acd4b747b4a19c69ef6846cd
> Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> Tested: 67 unique boards, 25 SoC families, 21 builds out of 199
>
> Boot Regressions Detected:
[...]
> x86:
>
> i386_defconfig:
> x86-celeron:
> lab-mhart: new failure (last pass: v4.18-rc4-147-g2db39a2f491a)
> x86-pentium4:
> lab-mhart: new failure (last pass: v4.18-rc4-147-g2db39a2f491a)
Please see below an automated bisection report for this
regression. Several bisections were run on other x86 platforms
with i386_defconfig on a few revisions up to v4.18-rc5, they all
reached the same "bad" commit.
Unfortunately there isn't much to learn from the kernelci.org
boot logs as the kernel seems to crash very early on:
https://kernelci.org/boot/all/job/mainline/branch/master/kernel/v4.18-rc5/https://storage.kernelci.org/mainline/master/v4.18-rc4-160-gf353078f028f/x8…
It looks like stable-rc/linux-4.17.y is also broken with
i386_defconfig, which tends to confirm the "bad" commit found by
the automated bisection which was applied there as well:
https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.17.y/kernel/v4.1…
The automated bisection on kernelci.org is still quite new, so
please take the results with a pinch of salt as the "bad" commit
found may not be the actual root cause of the boot failure.
Hope this helps!
Best wishes,
Guillaume
--------------------------------------8<--------------------------------------
Bisection result for mainline/master (v4.18-rc4-160-gf353078f028f) on x86-celeron
Good: 2db39a2f491a Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Bad: f353078f028f Merge branch 'akpm' (patches from Andrew)
Found: e181ae0c5db9 mm: zero unavailable pages before memmap init
Checks:
revert: PASS
verify: PASS
Parameters:
Tree: mainline
URL: http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Branch: master
Target: x86-celeron
Lab: lab-mhart
Config: i386_defconfig
Plan: boot
Breaking commit found:
-------------------------------------------------------------------------------
commit e181ae0c5db9544de9c53239eb22bc012ce75033
Author: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Date: Sat Jul 14 09:15:07 2018 -0400
mm: zero unavailable pages before memmap init
We must zero struct pages for memory that is not backed by physical
memory, or kernel does not have access to.
Recently, there was a change which zeroed all memmap for all holes in
e820. Unfortunately, it introduced a bug that is discussed here:
https://www.spinics.net/lists/linux-mm/msg156764.html
Linus, also saw this bug on his machine, and confirmed that reverting
commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into
memblock.reserved") fixes the issue.
The problem is that we incorrectly zero some struct pages after they
were setup.
The fix is to zero unavailable struct pages prior to initializing of
struct pages.
A more detailed fix should come later that would avoid double zeroing
cases: one in __init_single_page(), the other one in
zero_resv_unavail().
Fixes: 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved")
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1521100f1e63..5d800d61ddb7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6847,6 +6847,7 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
/* Initialise every node */
mminit_verify_pageflags_layout();
setup_nr_node_ids();
+ zero_resv_unavail();
for_each_online_node(nid) {
pg_data_t *pgdat = NODE_DATA(nid);
free_area_init_node(nid, NULL,
@@ -6857,7 +6858,6 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
node_set_state(nid, N_MEMORY);
check_for_memory(pgdat, nid);
}
- zero_resv_unavail();
}
static int __init cmdline_parse_core(char *p, unsigned long *core,
@@ -7033,9 +7033,9 @@ void __init set_dma_reserve(unsigned long new_dma_reserve)
void __init free_area_init(unsigned long *zones_size)
{
+ zero_resv_unavail();
free_area_init_node(0, zones_size,
__pa(PAGE_OFFSET) >> PAGE_SHIFT, NULL);
- zero_resv_unavail();
}
static int page_alloc_cpu_dead(unsigned int cpu)
-------------------------------------------------------------------------------
Git bisection log:
-------------------------------------------------------------------------------
git bisect start
# good: [2db39a2f491a48ec740e0214a7dd584eefc2137d] Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
git bisect good 2db39a2f491a48ec740e0214a7dd584eefc2137d
# bad: [f353078f028fbfe9acd4b747b4a19c69ef6846cd] Merge branch 'akpm' (patches from Andrew)
git bisect bad f353078f028fbfe9acd4b747b4a19c69ef6846cd
# good: [fa8cbda88db12e632a8987c94b66f5caf25bcec4] x86/purgatory: add missing FORCE to Makefile target
git bisect good fa8cbda88db12e632a8987c94b66f5caf25bcec4
# good: [bb177a732c4369bb58a1fe1df8f552b6f0f7db5f] mm: do not bug_on on incorrect length in __mm_populate()
git bisect good bb177a732c4369bb58a1fe1df8f552b6f0f7db5f
# good: [fe10e398e860955bac4d28ec031b701d358465e4] reiserfs: fix buffer overflow with long warning messages
git bisect good fe10e398e860955bac4d28ec031b701d358465e4
# bad: [e181ae0c5db9544de9c53239eb22bc012ce75033] mm: zero unavailable pages before memmap init
git bisect bad e181ae0c5db9544de9c53239eb22bc012ce75033
# first bad commit: [e181ae0c5db9544de9c53239eb22bc012ce75033] mm: zero unavailable pages before memmap init
-------------------------------------------------------------------------------
We can process 400+ images per day.
If you need any image editing, please let us know.
Photos cut out;
Photos clipping path;
Photos masking;
Photo shadow creation;
Photos retouching;
Beauty Model retouching on skin, face, body;
Glamour retouching;
Products retouching.
We can give you testing for your photos.
Turnaround time is fast
Thanks,
Simon
We can process 400+ images per day.
If you need any image editing, please let us know.
Photos cut out;
Photos clipping path;
Photos masking;
Photo shadow creation;
Photos retouching;
Beauty Model retouching on skin, face, body;
Glamour retouching;
Products retouching.
We can give you testing for your photos.
Turnaround time is fast
Thanks,
Simon
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 03703ed1debf777ea845aa9b50ba2e80a5e7dd3c Mon Sep 17 00:00:00 2001
From: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Date: Fri, 2 Feb 2018 05:08:59 -0500
Subject: [PATCH] media: vb2: core: Finish buffers at the end of the stream
If buffers were prepared or queued and the buffers were released without
starting the queue, the finish mem op (corresponding to the prepare mem
op) was never called to the buffers.
Before commit a136f59c0a1f there was no need to do this as in such a case
the prepare mem op had not been called yet. Address the problem by
explicitly calling finish mem op when the queue is stopped if the buffer
is in either prepared or queued state.
Fixes: a136f59c0a1f ("[media] vb2: Move buffer cache synchronisation to prepare from queue")
Cc: stable(a)vger.kernel.org # for v4.13 and up
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Tested-by: Devin Heitmueller <dheitmueller(a)kernellabs.com>
Signed-off-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)s-opensource.com>
diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index debe35fc66b4..d3f7bb33a54d 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -1696,6 +1696,15 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
for (i = 0; i < q->num_buffers; ++i) {
struct vb2_buffer *vb = q->bufs[i];
+ if (vb->state == VB2_BUF_STATE_PREPARED ||
+ vb->state == VB2_BUF_STATE_QUEUED) {
+ unsigned int plane;
+
+ for (plane = 0; plane < vb->num_planes; ++plane)
+ call_void_memop(vb, finish,
+ vb->planes[plane].mem_priv);
+ }
+
if (vb->state != VB2_BUF_STATE_DEQUEUED) {
vb->state = VB2_BUF_STATE_PREPARED;
call_void_vb_qop(vb, buf_finish, vb);
From: Dewet Thibaut <thibaut.dewet(a)nokia.com>
Previously the min interval limitation (one second) has been introduced.
In this case it was no longer possible to disable the machine check.
This patch removes this limitation and allows the value 0 again.
Cc: stable(a)vger.kernel.org
Fixes: b3b7c4795c ("x86/MCE: Serialize sysfs changes")
Signed-off-by: Dewet Thibaut <thibaut.dewet(a)nokia.com>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin(a)nokia.com>
---
arch/x86/kernel/cpu/mcheck/mce.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index c102ad51025e..8c50754c09c1 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2165,9 +2165,6 @@ static ssize_t store_int_with_restart(struct device *s,
if (check_interval == old_check_interval)
return ret;
- if (check_interval < 1)
- check_interval = 1;
-
mutex_lock(&mce_sysfs_mutex);
mce_restart();
mutex_unlock(&mce_sysfs_mutex);
--
2.18.0
This is a note to let you know that I've just added the patch titled
usb: cdc_acm: Add quirk for Castles VEGA3000
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 1445cbe476fc3dd09c0b380b206526a49403c071 Mon Sep 17 00:00:00 2001
From: Lubomir Rintel <lkundrak(a)v3.sk>
Date: Tue, 10 Jul 2018 08:28:49 +0200
Subject: usb: cdc_acm: Add quirk for Castles VEGA3000
The device (a POS terminal) implements CDC ACM, but has not union
descriptor.
Signed-off-by: Lubomir Rintel <lkundrak(a)v3.sk>
Acked-by: Oliver Neukum <oneukum(a)suse.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/class/cdc-acm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index 998b32d0167e..75c4623ad779 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -1831,6 +1831,9 @@ static const struct usb_device_id acm_ids[] = {
{ USB_DEVICE(0x09d8, 0x0320), /* Elatec GmbH TWN3 */
.driver_info = NO_UNION_NORMAL, /* has misplaced union descriptor */
},
+ { USB_DEVICE(0x0ca6, 0xa050), /* Castles VEGA3000 */
+ .driver_info = NO_UNION_NORMAL, /* reports zero length descriptor */
+ },
{ USB_DEVICE(0x2912, 0x0001), /* ATOL FPrint */
.driver_info = CLEAR_HALT_CONDITIONS,
--
2.18.0
This is a note to let you know that I've just added the patch titled
eventpoll.h: wrap casts in () properly
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 45cd74cb5061781e793a098c420a7f548fdc9e7d Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Date: Tue, 10 Jul 2018 17:15:38 +0200
Subject: eventpoll.h: wrap casts in () properly
When importing the latest copy of the kernel headers into Bionic,
Christpher and Elliott noticed that the eventpoll.h casts were not
wrapped in (). As it is, clang complains about macros without
surrounding (), so this makes it a pain for userspace tools.
So fix it up by adding another () pair, and make them line up purty by
using tabs.
Fixes: 65aaf87b3aa2 ("add EPOLLNVAL, annotate EPOLL... and event_poll->event")
Reported-by: Christopher Ferris <cferris(a)google.com>
Reported-by: Elliott Hughes <enh(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/uapi/linux/eventpoll.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/eventpoll.h b/include/uapi/linux/eventpoll.h
index bf48e71f2634..8a3432d0f0dc 100644
--- a/include/uapi/linux/eventpoll.h
+++ b/include/uapi/linux/eventpoll.h
@@ -42,7 +42,7 @@
#define EPOLLRDHUP (__force __poll_t)0x00002000
/* Set exclusive wakeup mode for the target file descriptor */
-#define EPOLLEXCLUSIVE (__force __poll_t)(1U << 28)
+#define EPOLLEXCLUSIVE ((__force __poll_t)(1U << 28))
/*
* Request the handling of system wakeup events so as to prevent system suspends
@@ -54,13 +54,13 @@
*
* Requires CAP_BLOCK_SUSPEND
*/
-#define EPOLLWAKEUP (__force __poll_t)(1U << 29)
+#define EPOLLWAKEUP ((__force __poll_t)(1U << 29))
/* Set the One Shot behaviour for the target file descriptor */
-#define EPOLLONESHOT (__force __poll_t)(1U << 30)
+#define EPOLLONESHOT ((__force __poll_t)(1U << 30))
/* Set the Edge Triggered behaviour for the target file descriptor */
-#define EPOLLET (__force __poll_t)(1U << 31)
+#define EPOLLET ((__force __poll_t)(1U << 31))
/*
* On x86-64 make the 64bit structure have the same alignment as the
--
2.18.0
The patch below does not apply to the 4.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 37becec95ac31b209eb1c8e096f1093a7db00f32 Mon Sep 17 00:00:00 2001
From: Omar Sandoval <osandov(a)fb.com>
Date: Mon, 21 May 2018 17:07:19 -0700
Subject: [PATCH] Btrfs: allow empty subvol= again
I got a report that after upgrading to 4.16, someone's filesystems
weren't mounting:
[ 23.845852] BTRFS info (device loop0): unrecognized mount option 'subvol='
Before 4.16, this mounted the default subvolume. It turns out that this
empty "subvol=" is actually an application bug, but it was causing the
application to fail, so it's an ABI break if you squint.
The generic parsing code we use for mount options (match_token())
doesn't match an empty string as "%s". Previously, setup_root_args()
removed the "subvol=" string, but the mount path was cleaned up to not
need that. Add a dummy Opt_subvol_empty to fix this.
The simple workaround is to use / or . for the value of 'subvol=' .
Fixes: 312c89fbca06 ("btrfs: cleanup btrfs_mount() using btrfs_mount_root()")
CC: stable(a)vger.kernel.org # 4.16+
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index c67fafaa2fe7..81107ad49f3a 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -323,6 +323,7 @@ enum {
Opt_ssd, Opt_nossd,
Opt_ssd_spread, Opt_nossd_spread,
Opt_subvol,
+ Opt_subvol_empty,
Opt_subvolid,
Opt_thread_pool,
Opt_treelog, Opt_notreelog,
@@ -388,6 +389,7 @@ static const match_table_t tokens = {
{Opt_ssd_spread, "ssd_spread"},
{Opt_nossd_spread, "nossd_spread"},
{Opt_subvol, "subvol=%s"},
+ {Opt_subvol_empty, "subvol="},
{Opt_subvolid, "subvolid=%s"},
{Opt_thread_pool, "thread_pool=%u"},
{Opt_treelog, "treelog"},
@@ -461,6 +463,7 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
btrfs_set_opt(info->mount_opt, DEGRADED);
break;
case Opt_subvol:
+ case Opt_subvol_empty:
case Opt_subvolid:
case Opt_subvolrootid:
case Opt_device:
The regset API documented in <linux/regset.h> defines -ENODEV as the
result of the `->active' handler to be used where the feature requested
is not available on the hardware found. However code handling core file
note generation in `fill_thread_core_info' interpretes any non-zero
result from the `->active' handler as the regset requested being active.
Consequently processing continues (and hopefully gracefully fails later
on) rather than being abandoned right away for the regset requested.
Fix the problem then by making the code proceed only if a positive
result is returned from the `->active' handler.
Cc: stable(a)vger.kernel.org # 2.6.25+
Fixes: 4206d3aa1978 ("elf core dump: notes user_regset")
Signed-off-by: Maciej W. Rozycki <macro(a)mips.com>
---
Hi,
Overall we could also use the count returned by ->active to limit the
size of data requested, i.e. something along the lines of:
ssize_t size;
if (regset->active)
size = regset->active(t->task, regset);
else
size = regset_size(t->task, regset);
if (size > 0) {
...
however that would be an optimisation that belongs to a separate change,
which (due to the need to unload stuff I have in progress already) I am
not going to make at this point. Perhaps someone else would be willing
to pick up the idea.
Anyway, please apply.
Maciej
---
fs/binfmt_elf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
linux-elf-core-regset-active.diff
Index: linux-jhogan-test/fs/binfmt_elf.c
===================================================================
--- linux-jhogan-test.orig/fs/binfmt_elf.c 2018-03-21 17:14:55.000000000 +0000
+++ linux-jhogan-test/fs/binfmt_elf.c 2018-05-09 23:25:50.742255000 +0100
@@ -1739,7 +1739,7 @@ static int fill_thread_core_info(struct
const struct user_regset *regset = &view->regsets[i];
do_thread_regset_writeback(t->task, regset);
if (regset->core_note_type && regset->get &&
- (!regset->active || regset->active(t->task, regset))) {
+ (!regset->active || regset->active(t->task, regset) > 0)) {
int ret;
size_t size = regset_size(t->task, regset);
void *data = kmalloc(size, GFP_KERNEL);
From: Waldemar Rymarkiewicz <waldemar.rymarkiewicz(a)gmail.com>
Original commit c5c2a97b3ac7 ("PM / OPP: Update voltage in case freq ==
old_freq").
This commit fixes a rare but possible case when the clk rate is updated
without update of the regulator voltage.
At boot up, CPUfreq checks if the system is running at the right freq. This
is a sanity check in case a bootloader set clk rate that is outside of freq
table present with cpufreq core. In such cases system can be unstable so
better to change it to a freq that is preset in freq-table.
The CPUfreq takes next freq that is >= policy->cur and this is our
target_freq that needs to be set now.
dev_pm_opp_set_rate(dev, target_freq) checks the target_freq and the
old_freq (a current rate). If these are equal it returns early. If not,
it searches for OPP (old_opp) that fits best to old_freq (not listed in
the table) and updates old_freq (!).
Here, we can end up with old_freq = old_opp.rate = target_freq, which
is not handled in _generic_set_opp_regulator(). It's supposed to update
voltage only when freq > old_freq || freq > old_freq.
if (freq > old_freq) {
ret = _set_opp_voltage(dev, reg, new_supply);
[...]
if (freq < old_freq) {
ret = _set_opp_voltage(dev, reg, new_supply);
if (ret)
It results in, no voltage update while clk rate is updated.
Example:
freq-table = {
1000MHz 1.15V
666MHZ 1.10V
333MHz 1.05V
}
boot-up-freq = 800MHz # not listed in freq-table
freq = target_freq = 1GHz
old_freq = 800Mhz
old_opp = _find_freq_ceil(opp_table, &old_freq); #(old_freq is modified!)
old_freq = 1GHz
Fixes: 6a0712f6f199 ("PM / OPP: Add dev_pm_opp_set_rate()")
Cc: 4.6+ <stable(a)vger.kernel.org> # v4.6+
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz(a)gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
Sending it for stable kernels from 4.6 until 4.9.
drivers/base/power/opp/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 4c7c6da7a989..0580cbe5bd1c 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -642,7 +642,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
rcu_read_unlock();
/* Scaling up? Scale voltage before frequency */
- if (freq > old_freq) {
+ if (freq >= old_freq) {
ret = _set_opp_voltage(dev, reg, u_volt, u_volt_min,
u_volt_max);
if (ret)
--
2.18.0.rc1.242.g61856ae69a2c
This is a note to let you know that I've just added the patch titled
eventpoll.h: wrap casts in () properly
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 45cd74cb5061781e793a098c420a7f548fdc9e7d Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Date: Tue, 10 Jul 2018 17:15:38 +0200
Subject: eventpoll.h: wrap casts in () properly
When importing the latest copy of the kernel headers into Bionic,
Christpher and Elliott noticed that the eventpoll.h casts were not
wrapped in (). As it is, clang complains about macros without
surrounding (), so this makes it a pain for userspace tools.
So fix it up by adding another () pair, and make them line up purty by
using tabs.
Fixes: 65aaf87b3aa2 ("add EPOLLNVAL, annotate EPOLL... and event_poll->event")
Reported-by: Christopher Ferris <cferris(a)google.com>
Reported-by: Elliott Hughes <enh(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/uapi/linux/eventpoll.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/eventpoll.h b/include/uapi/linux/eventpoll.h
index bf48e71f2634..8a3432d0f0dc 100644
--- a/include/uapi/linux/eventpoll.h
+++ b/include/uapi/linux/eventpoll.h
@@ -42,7 +42,7 @@
#define EPOLLRDHUP (__force __poll_t)0x00002000
/* Set exclusive wakeup mode for the target file descriptor */
-#define EPOLLEXCLUSIVE (__force __poll_t)(1U << 28)
+#define EPOLLEXCLUSIVE ((__force __poll_t)(1U << 28))
/*
* Request the handling of system wakeup events so as to prevent system suspends
@@ -54,13 +54,13 @@
*
* Requires CAP_BLOCK_SUSPEND
*/
-#define EPOLLWAKEUP (__force __poll_t)(1U << 29)
+#define EPOLLWAKEUP ((__force __poll_t)(1U << 29))
/* Set the One Shot behaviour for the target file descriptor */
-#define EPOLLONESHOT (__force __poll_t)(1U << 30)
+#define EPOLLONESHOT ((__force __poll_t)(1U << 30))
/* Set the Edge Triggered behaviour for the target file descriptor */
-#define EPOLLET (__force __poll_t)(1U << 31)
+#define EPOLLET ((__force __poll_t)(1U << 31))
/*
* On x86-64 make the 64bit structure have the same alignment as the
--
2.18.0
Commit 815c6704bf9f1c59f3a6be380a4032b9c57b12f1 upstream.
The controller memory buffer is remapped into a kernel address on each
reset, but the driver was setting the submission queue base address
only on the very first queue creation. The remapped address is likely to
change after a reset, so accessing the old address will hit a kernel bug.
This patch fixes that by setting the queue's CMB base address each time
the queue is created.
Fixes: f63572dff1421 ("nvme: unmap CMB and remove sysfs file in reset path")
Reported-by: Christian Black <christian.d.black(a)intel.com>
Cc: Jon Derrick <jonathan.derrick(a)intel.com>
Signed-off-by: Keith Busch <keith.busch(a)intel.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Scott Bauer <scott.bauer(a)intel.com>
---
drivers/nvme/host/pci.c | 27 ++++++++++++++++-----------
1 file changed, 16 insertions(+), 11 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 3d4724e38aa9..4cac4755abef 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1233,17 +1233,15 @@ static int nvme_cmb_qdepth(struct nvme_dev *dev, int nr_io_queues,
static int nvme_alloc_sq_cmds(struct nvme_dev *dev, struct nvme_queue *nvmeq,
int qid, int depth)
{
- if (qid && dev->cmb && use_cmb_sqes && NVME_CMB_SQS(dev->cmbsz)) {
- unsigned offset = (qid - 1) * roundup(SQ_SIZE(depth),
- dev->ctrl.page_size);
- nvmeq->sq_dma_addr = dev->cmb_bus_addr + offset;
- nvmeq->sq_cmds_io = dev->cmb + offset;
- } else {
- nvmeq->sq_cmds = dma_alloc_coherent(dev->dev, SQ_SIZE(depth),
- &nvmeq->sq_dma_addr, GFP_KERNEL);
- if (!nvmeq->sq_cmds)
- return -ENOMEM;
- }
+
+ /* CMB SQEs will be mapped before creation */
+ if (qid && dev->cmb && use_cmb_sqes && NVME_CMB_SQS(dev->cmbsz))
+ return 0;
+
+ nvmeq->sq_cmds = dma_alloc_coherent(dev->dev, SQ_SIZE(depth),
+ &nvmeq->sq_dma_addr, GFP_KERNEL);
+ if (!nvmeq->sq_cmds)
+ return -ENOMEM;
return 0;
}
@@ -1320,6 +1318,13 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
struct nvme_dev *dev = nvmeq->dev;
int result;
+ if (qid && dev->cmb && use_cmb_sqes && NVME_CMB_SQS(dev->cmbsz)) {
+ unsigned offset = (qid - 1) * roundup(SQ_SIZE(nvmeq->q_depth),
+ dev->ctrl.page_size);
+ nvmeq->sq_dma_addr = dev->cmb_bus_addr + offset;
+ nvmeq->sq_cmds_io = dev->cmb + offset;
+ }
+
nvmeq->cq_vector = qid - 1;
result = adapter_alloc_cq(dev, qid, nvmeq);
if (result < 0)
--
2.17.1
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From abe41184abac487264a4904bfcff2d5500dccce6 Mon Sep 17 00:00:00 2001
From: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Date: Tue, 10 Jul 2018 23:42:15 +0200
Subject: [PATCH] i2c: recovery: if possible send STOP with recovery pulses
I2C clients may misunderstand recovery pulses if they can't read SDA to
bail out early. In the worst case, as a write operation. To avoid that
and if we can write SDA, try to send STOP to avoid the
misinterpretation.
Signed-off-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Reviewed-by: Peter Rosin <peda(a)axentia.se>
Signed-off-by: Wolfram Sang <wsa(a)the-dreams.de>
Cc: stable(a)kernel.org
diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c
index 31d16ada6e7d..301285c54603 100644
--- a/drivers/i2c/i2c-core-base.c
+++ b/drivers/i2c/i2c-core-base.c
@@ -198,7 +198,16 @@ int i2c_generic_scl_recovery(struct i2c_adapter *adap)
val = !val;
bri->set_scl(adap, val);
- ndelay(RECOVERY_NDELAY);
+
+ /*
+ * If we can set SDA, we will always create STOP here to ensure
+ * the additional pulses will do no harm. This is achieved by
+ * letting SDA follow SCL half a cycle later.
+ */
+ ndelay(RECOVERY_NDELAY / 2);
+ if (bri->set_sda)
+ bri->set_sda(adap, val);
+ ndelay(RECOVERY_NDELAY / 2);
}
/* check if recovery actually succeeded */
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b697d7d8c741f27b728a878fc55852b06d0f6f5e Mon Sep 17 00:00:00 2001
From: "Michael J. Ruhl" <michael.j.ruhl(a)intel.com>
Date: Wed, 20 Jun 2018 09:29:08 -0700
Subject: [PATCH] IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return
values
The __get_txreq() function can return a pointer, ERR_PTR(-EBUSY), or NULL.
All of the relevant call sites look for IS_ERR, so the NULL return would
lead to a NULL pointer exception.
Do not use the ERR_PTR mechanism for this function.
Update all call sites to handle the return value correctly.
Clean up error paths to reflect return value.
Fixes: 45842abbb292 ("staging/rdma/hfi1: move txreq header code")
Cc: <stable(a)vger.kernel.org> # 4.9.x+
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Reviewed-by: Kamenee Arumugam <kamenee.arumugam(a)intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro(a)intel.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
index 1a1a47ac53c6..f15c93102081 100644
--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -271,7 +271,7 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
lockdep_assert_held(&qp->s_lock);
ps->s_txreq = get_txreq(ps->dev, qp);
- if (IS_ERR(ps->s_txreq))
+ if (!ps->s_txreq)
goto bail_no_tx;
if (priv->hdr_type == HFI1_PKT_TYPE_9B) {
diff --git a/drivers/infiniband/hw/hfi1/uc.c b/drivers/infiniband/hw/hfi1/uc.c
index b7b671017e59..e254dcec6f64 100644
--- a/drivers/infiniband/hw/hfi1/uc.c
+++ b/drivers/infiniband/hw/hfi1/uc.c
@@ -1,5 +1,5 @@
/*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2018 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@@ -72,7 +72,7 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
int middle = 0;
ps->s_txreq = get_txreq(ps->dev, qp);
- if (IS_ERR(ps->s_txreq))
+ if (!ps->s_txreq)
goto bail_no_tx;
if (!(ib_rvt_state_ops[qp->state] & RVT_PROCESS_SEND_OK)) {
diff --git a/drivers/infiniband/hw/hfi1/ud.c b/drivers/infiniband/hw/hfi1/ud.c
index 1ab332f1866e..70d39fc450a1 100644
--- a/drivers/infiniband/hw/hfi1/ud.c
+++ b/drivers/infiniband/hw/hfi1/ud.c
@@ -1,5 +1,5 @@
/*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2018 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@@ -503,7 +503,7 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
u32 lid;
ps->s_txreq = get_txreq(ps->dev, qp);
- if (IS_ERR(ps->s_txreq))
+ if (!ps->s_txreq)
goto bail_no_tx;
if (!(ib_rvt_state_ops[qp->state] & RVT_PROCESS_NEXT_SEND_OK)) {
diff --git a/drivers/infiniband/hw/hfi1/verbs_txreq.c b/drivers/infiniband/hw/hfi1/verbs_txreq.c
index 873e48ea923f..c4ab2d5b4502 100644
--- a/drivers/infiniband/hw/hfi1/verbs_txreq.c
+++ b/drivers/infiniband/hw/hfi1/verbs_txreq.c
@@ -1,5 +1,5 @@
/*
- * Copyright(c) 2016 - 2017 Intel Corporation.
+ * Copyright(c) 2016 - 2018 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@@ -94,7 +94,7 @@ struct verbs_txreq *__get_txreq(struct hfi1_ibdev *dev,
struct rvt_qp *qp)
__must_hold(&qp->s_lock)
{
- struct verbs_txreq *tx = ERR_PTR(-EBUSY);
+ struct verbs_txreq *tx = NULL;
write_seqlock(&dev->txwait_lock);
if (ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK) {
diff --git a/drivers/infiniband/hw/hfi1/verbs_txreq.h b/drivers/infiniband/hw/hfi1/verbs_txreq.h
index 729244c3086c..1c19bbc764b2 100644
--- a/drivers/infiniband/hw/hfi1/verbs_txreq.h
+++ b/drivers/infiniband/hw/hfi1/verbs_txreq.h
@@ -1,5 +1,5 @@
/*
- * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2016 - 2018 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@@ -83,7 +83,7 @@ static inline struct verbs_txreq *get_txreq(struct hfi1_ibdev *dev,
if (unlikely(!tx)) {
/* call slow path to get the lock */
tx = __get_txreq(dev, qp);
- if (IS_ERR(tx))
+ if (!tx)
return tx;
}
tx->qp = qp;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe48aecb4df837540f13b5216f27ddb306aaf4b9 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro(a)mellanox.com>
Date: Sun, 1 Jul 2018 15:31:54 +0300
Subject: [PATCH] RDMA/uverbs: Don't fail in creation of multiple flows
The conversion from offsetof() calculations to sizeof()
wrongly behaved for missed exact size and in scenario with
more than one flow.
In such scenario we got "create flow failed, flow 10: 8 bytes
left from uverb cmd" error, which is wrong because the size of
kern_spec is exactly 8 bytes, and we were not supposed to fail.
Cc: <stable(a)vger.kernel.org> # 3.12
Fixes: 4fae7f170416 ("RDMA/uverbs: Fix slab-out-of-bounds in ib_uverbs_ex_create_flow")
Reported-by: Ran Rozenstein <ranro(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 87ffeebc0b28..cc06e8404e9b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3586,7 +3586,7 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
kern_spec = kern_flow_attr->flow_specs;
ib_spec = flow_attr + 1;
for (i = 0; i < flow_attr->num_of_specs &&
- cmd.flow_attr.size > sizeof(*kern_spec) &&
+ cmd.flow_attr.size >= sizeof(*kern_spec) &&
cmd.flow_attr.size >= kern_spec->size;
i++) {
err = kern_spec_to_ib_spec(
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe48aecb4df837540f13b5216f27ddb306aaf4b9 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro(a)mellanox.com>
Date: Sun, 1 Jul 2018 15:31:54 +0300
Subject: [PATCH] RDMA/uverbs: Don't fail in creation of multiple flows
The conversion from offsetof() calculations to sizeof()
wrongly behaved for missed exact size and in scenario with
more than one flow.
In such scenario we got "create flow failed, flow 10: 8 bytes
left from uverb cmd" error, which is wrong because the size of
kern_spec is exactly 8 bytes, and we were not supposed to fail.
Cc: <stable(a)vger.kernel.org> # 3.12
Fixes: 4fae7f170416 ("RDMA/uverbs: Fix slab-out-of-bounds in ib_uverbs_ex_create_flow")
Reported-by: Ran Rozenstein <ranro(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 87ffeebc0b28..cc06e8404e9b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3586,7 +3586,7 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
kern_spec = kern_flow_attr->flow_specs;
ib_spec = flow_attr + 1;
for (i = 0; i < flow_attr->num_of_specs &&
- cmd.flow_attr.size > sizeof(*kern_spec) &&
+ cmd.flow_attr.size >= sizeof(*kern_spec) &&
cmd.flow_attr.size >= kern_spec->size;
i++) {
err = kern_spec_to_ib_spec(
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe48aecb4df837540f13b5216f27ddb306aaf4b9 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro(a)mellanox.com>
Date: Sun, 1 Jul 2018 15:31:54 +0300
Subject: [PATCH] RDMA/uverbs: Don't fail in creation of multiple flows
The conversion from offsetof() calculations to sizeof()
wrongly behaved for missed exact size and in scenario with
more than one flow.
In such scenario we got "create flow failed, flow 10: 8 bytes
left from uverb cmd" error, which is wrong because the size of
kern_spec is exactly 8 bytes, and we were not supposed to fail.
Cc: <stable(a)vger.kernel.org> # 3.12
Fixes: 4fae7f170416 ("RDMA/uverbs: Fix slab-out-of-bounds in ib_uverbs_ex_create_flow")
Reported-by: Ran Rozenstein <ranro(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 87ffeebc0b28..cc06e8404e9b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3586,7 +3586,7 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
kern_spec = kern_flow_attr->flow_specs;
ib_spec = flow_attr + 1;
for (i = 0; i < flow_attr->num_of_specs &&
- cmd.flow_attr.size > sizeof(*kern_spec) &&
+ cmd.flow_attr.size >= sizeof(*kern_spec) &&
cmd.flow_attr.size >= kern_spec->size;
i++) {
err = kern_spec_to_ib_spec(
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe48aecb4df837540f13b5216f27ddb306aaf4b9 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro(a)mellanox.com>
Date: Sun, 1 Jul 2018 15:31:54 +0300
Subject: [PATCH] RDMA/uverbs: Don't fail in creation of multiple flows
The conversion from offsetof() calculations to sizeof()
wrongly behaved for missed exact size and in scenario with
more than one flow.
In such scenario we got "create flow failed, flow 10: 8 bytes
left from uverb cmd" error, which is wrong because the size of
kern_spec is exactly 8 bytes, and we were not supposed to fail.
Cc: <stable(a)vger.kernel.org> # 3.12
Fixes: 4fae7f170416 ("RDMA/uverbs: Fix slab-out-of-bounds in ib_uverbs_ex_create_flow")
Reported-by: Ran Rozenstein <ranro(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 87ffeebc0b28..cc06e8404e9b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3586,7 +3586,7 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
kern_spec = kern_flow_attr->flow_specs;
ib_spec = flow_attr + 1;
for (i = 0; i < flow_attr->num_of_specs &&
- cmd.flow_attr.size > sizeof(*kern_spec) &&
+ cmd.flow_attr.size >= sizeof(*kern_spec) &&
cmd.flow_attr.size >= kern_spec->size;
i++) {
err = kern_spec_to_ib_spec(
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 940efcc8889f0d15567eb07fc9fd69b06e366aa5 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro(a)mellanox.com>
Date: Sun, 24 Jun 2018 11:23:42 +0300
Subject: [PATCH] RDMA/uverbs: Protect from attempts to create flows on
unsupported QP
Flows can be created on UD and RAW_PACKET QP types. Attempts to provide
other QP types as an input causes to various unpredictable failures.
The reason is that in order to support all various types (e.g. XRC), we
are supposed to use real_qp handle and not qp handle and expect to
driver/FW to fail such (XRC) flows. The simpler and safer variant is to
ban all QP types except UD and RAW_PACKET, instead of relying on
driver/FW.
Cc: <stable(a)vger.kernel.org> # 3.11
Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs")
Cc: syzkaller <syzkaller(a)googlegroups.com>
Reported-by: Noa Osherovich <noaos(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 3e90b6a1d9d2..89c4ce2da78b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3559,6 +3559,11 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
goto err_uobj;
}
+ if (qp->qp_type != IB_QPT_UD && qp->qp_type != IB_QPT_RAW_PACKET) {
+ err = -EINVAL;
+ goto err_put;
+ }
+
flow_attr = kzalloc(struct_size(flow_attr, flows,
cmd.flow_attr.num_of_specs), GFP_KERNEL);
if (!flow_attr) {
The patch below does not apply to the 4.3-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 940efcc8889f0d15567eb07fc9fd69b06e366aa5 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro(a)mellanox.com>
Date: Sun, 24 Jun 2018 11:23:42 +0300
Subject: [PATCH] RDMA/uverbs: Protect from attempts to create flows on
unsupported QP
Flows can be created on UD and RAW_PACKET QP types. Attempts to provide
other QP types as an input causes to various unpredictable failures.
The reason is that in order to support all various types (e.g. XRC), we
are supposed to use real_qp handle and not qp handle and expect to
driver/FW to fail such (XRC) flows. The simpler and safer variant is to
ban all QP types except UD and RAW_PACKET, instead of relying on
driver/FW.
Cc: <stable(a)vger.kernel.org> # 3.11
Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs")
Cc: syzkaller <syzkaller(a)googlegroups.com>
Reported-by: Noa Osherovich <noaos(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 3e90b6a1d9d2..89c4ce2da78b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3559,6 +3559,11 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
goto err_uobj;
}
+ if (qp->qp_type != IB_QPT_UD && qp->qp_type != IB_QPT_RAW_PACKET) {
+ err = -EINVAL;
+ goto err_put;
+ }
+
flow_attr = kzalloc(struct_size(flow_attr, flows,
cmd.flow_attr.num_of_specs), GFP_KERNEL);
if (!flow_attr) {
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 940efcc8889f0d15567eb07fc9fd69b06e366aa5 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro(a)mellanox.com>
Date: Sun, 24 Jun 2018 11:23:42 +0300
Subject: [PATCH] RDMA/uverbs: Protect from attempts to create flows on
unsupported QP
Flows can be created on UD and RAW_PACKET QP types. Attempts to provide
other QP types as an input causes to various unpredictable failures.
The reason is that in order to support all various types (e.g. XRC), we
are supposed to use real_qp handle and not qp handle and expect to
driver/FW to fail such (XRC) flows. The simpler and safer variant is to
ban all QP types except UD and RAW_PACKET, instead of relying on
driver/FW.
Cc: <stable(a)vger.kernel.org> # 3.11
Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs")
Cc: syzkaller <syzkaller(a)googlegroups.com>
Reported-by: Noa Osherovich <noaos(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 3e90b6a1d9d2..89c4ce2da78b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3559,6 +3559,11 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
goto err_uobj;
}
+ if (qp->qp_type != IB_QPT_UD && qp->qp_type != IB_QPT_RAW_PACKET) {
+ err = -EINVAL;
+ goto err_put;
+ }
+
flow_attr = kzalloc(struct_size(flow_attr, flows,
cmd.flow_attr.num_of_specs), GFP_KERNEL);
if (!flow_attr) {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 940efcc8889f0d15567eb07fc9fd69b06e366aa5 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro(a)mellanox.com>
Date: Sun, 24 Jun 2018 11:23:42 +0300
Subject: [PATCH] RDMA/uverbs: Protect from attempts to create flows on
unsupported QP
Flows can be created on UD and RAW_PACKET QP types. Attempts to provide
other QP types as an input causes to various unpredictable failures.
The reason is that in order to support all various types (e.g. XRC), we
are supposed to use real_qp handle and not qp handle and expect to
driver/FW to fail such (XRC) flows. The simpler and safer variant is to
ban all QP types except UD and RAW_PACKET, instead of relying on
driver/FW.
Cc: <stable(a)vger.kernel.org> # 3.11
Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs")
Cc: syzkaller <syzkaller(a)googlegroups.com>
Reported-by: Noa Osherovich <noaos(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 3e90b6a1d9d2..89c4ce2da78b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3559,6 +3559,11 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
goto err_uobj;
}
+ if (qp->qp_type != IB_QPT_UD && qp->qp_type != IB_QPT_RAW_PACKET) {
+ err = -EINVAL;
+ goto err_put;
+ }
+
flow_attr = kzalloc(struct_size(flow_attr, flows,
cmd.flow_attr.num_of_specs), GFP_KERNEL);
if (!flow_attr) {
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 940efcc8889f0d15567eb07fc9fd69b06e366aa5 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro(a)mellanox.com>
Date: Sun, 24 Jun 2018 11:23:42 +0300
Subject: [PATCH] RDMA/uverbs: Protect from attempts to create flows on
unsupported QP
Flows can be created on UD and RAW_PACKET QP types. Attempts to provide
other QP types as an input causes to various unpredictable failures.
The reason is that in order to support all various types (e.g. XRC), we
are supposed to use real_qp handle and not qp handle and expect to
driver/FW to fail such (XRC) flows. The simpler and safer variant is to
ban all QP types except UD and RAW_PACKET, instead of relying on
driver/FW.
Cc: <stable(a)vger.kernel.org> # 3.11
Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs")
Cc: syzkaller <syzkaller(a)googlegroups.com>
Reported-by: Noa Osherovich <noaos(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 3e90b6a1d9d2..89c4ce2da78b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3559,6 +3559,11 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
goto err_uobj;
}
+ if (qp->qp_type != IB_QPT_UD && qp->qp_type != IB_QPT_RAW_PACKET) {
+ err = -EINVAL;
+ goto err_put;
+ }
+
flow_attr = kzalloc(struct_size(flow_attr, flows,
cmd.flow_attr.num_of_specs), GFP_KERNEL);
if (!flow_attr) {
Just a note that I just pushed out commit e181ae0c5db9 ("mm: zero
unavailable pages before memmap init") that fixes the nasty VM crash
issue I had, and that apparently the Fedora people have also seen in
the wild in 4.17.4.
The commit that triggered this didn't make it in until 4.18-rc3, but
it has apparently already been back-ported to stable, so now the fix
needs to be back-ported too.
Linus
Fedora has integrated the jitter entropy daemon to work around slow
boot problems, especially on VM's that don't support virtio-rng:
https://bugzilla.redhat.com/show_bug.cgi?id=1572944
It's understandable why they did this, but the Jitter entropy daemon
works fundamentally on the principle: "the CPU microarchitecture is
**so** complicated and we can't figure it out, so it *must* be
random". Yes, it uses statistical tests to "prove" it is secure, but
AES_ENCRYPT(NSA_KEY, COUNTER++) will also pass statistical tests with
flying colors.
So if RDRAND is available, mix it into entropy submitted from
userspace. It can't hurt, and if you believe the NSA has backdoored
RDRAND, then they probably have enough details about the Intel
microarchitecture that they can reverse engineer how the Jitter
entropy daemon affects the microarchitecture, and attack its output
stream. And if RDRAND is in fact an honest DRNG, it will immeasurably
improve on what the Jitter entropy daemon might produce.
This also provides some protection against someone who is able to read
or set the entropy seed file.
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)vger.kernel.org
---
drivers/char/random.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0706646b018d..283fe390e878 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1896,14 +1896,22 @@ static int
write_pool(struct entropy_store *r, const char __user *buffer, size_t count)
{
size_t bytes;
- __u32 buf[16];
+ __u32 t, buf[16];
const char __user *p = buffer;
while (count > 0) {
+ int b, i = 0;
+
bytes = min(count, sizeof(buf));
if (copy_from_user(&buf, p, bytes))
return -EFAULT;
+ for (b = bytes ; b > 0 ; b -= sizeof(__u32), i++) {
+ if (arch_get_random_int(&t))
+ continue;
+ buf[i] ^= t;
+ }
+
count -= bytes;
p += bytes;
--
2.18.0.rc0