This is a note to let you know that I've just added the patch titled
rtmutex: Fix PI chain order integrity
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rtmutex-fix-pi-chain-order-integrity.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Thu, 23 Mar 2017 15:56:13 +0100
Subject: rtmutex: Fix PI chain order integrity
From: Peter Zijlstra <peterz(a)infradead.org>
[ Upstream commit e0aad5b44ff5d28ac1d6ae70cdf84ca228e889dc ]
rt_mutex_waiter::prio is a copy of task_struct::prio which is updated
during the PI chain walk, such that the PI chain order isn't messed up
by (asynchronous) task state updates.
Currently rt_mutex_waiter_less() uses task state for deadline tasks;
this is broken, since the task state can, as said above, change
asynchronously, causing the RB tree order to change without actual
tree update -> FAIL.
Fix this by also copying the deadline into the rt_mutex_waiter state
and updating it along with its prio field.
Ideally we would also force PI chain updates whenever DL tasks update
their deadline parameter, but for first approximation this is less
broken than it was.
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: juri.lelli(a)arm.com
Cc: bigeasy(a)linutronix.de
Cc: xlpang(a)redhat.com
Cc: rostedt(a)goodmis.org
Cc: mathieu.desnoyers(a)efficios.com
Cc: jdesfossez(a)efficios.com
Cc: bristot(a)redhat.com
Link: http://lkml.kernel.org/r/20170323150216.403992539@infradead.org
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/locking/rtmutex.c | 29 +++++++++++++++++++++++++++--
kernel/locking/rtmutex_common.h | 1 +
2 files changed, 28 insertions(+), 2 deletions(-)
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -236,8 +236,7 @@ rt_mutex_waiter_less(struct rt_mutex_wai
* then right waiter has a dl_prio() too.
*/
if (dl_prio(left->prio))
- return dl_time_before(left->task->dl.deadline,
- right->task->dl.deadline);
+ return dl_time_before(left->deadline, right->deadline);
return 0;
}
@@ -704,7 +703,26 @@ static int rt_mutex_adjust_prio_chain(st
/* [7] Requeue the waiter in the lock waiter tree. */
rt_mutex_dequeue(lock, waiter);
+
+ /*
+ * Update the waiter prio fields now that we're dequeued.
+ *
+ * These values can have changed through either:
+ *
+ * sys_sched_set_scheduler() / sys_sched_setattr()
+ *
+ * or
+ *
+ * DL CBS enforcement advancing the effective deadline.
+ *
+ * Even though pi_waiters also uses these fields, and that tree is only
+ * updated in [11], we can do this here, since we hold [L], which
+ * serializes all pi_waiters access and rb_erase() does not care about
+ * the values of the node being removed.
+ */
waiter->prio = task->prio;
+ waiter->deadline = task->dl.deadline;
+
rt_mutex_enqueue(lock, waiter);
/* [8] Release the task */
@@ -831,6 +849,8 @@ static int rt_mutex_adjust_prio_chain(st
static int try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task,
struct rt_mutex_waiter *waiter)
{
+ lockdep_assert_held(&lock->wait_lock);
+
/*
* Before testing whether we can acquire @lock, we set the
* RT_MUTEX_HAS_WAITERS bit in @lock->owner. This forces all
@@ -958,6 +978,8 @@ static int task_blocks_on_rt_mutex(struc
struct rt_mutex *next_lock;
int chain_walk = 0, res;
+ lockdep_assert_held(&lock->wait_lock);
+
/*
* Early deadlock detection. We really don't want the task to
* enqueue on itself just to untangle the mess later. It's not
@@ -975,6 +997,7 @@ static int task_blocks_on_rt_mutex(struc
waiter->task = task;
waiter->lock = lock;
waiter->prio = task->prio;
+ waiter->deadline = task->dl.deadline;
/* Get the top priority waiter on the lock */
if (rt_mutex_has_waiters(lock))
@@ -1080,6 +1103,8 @@ static void remove_waiter(struct rt_mute
struct task_struct *owner = rt_mutex_owner(lock);
struct rt_mutex *next_lock;
+ lockdep_assert_held(&lock->wait_lock);
+
raw_spin_lock(¤t->pi_lock);
rt_mutex_dequeue(lock, waiter);
current->pi_blocked_on = NULL;
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -33,6 +33,7 @@ struct rt_mutex_waiter {
struct rt_mutex *deadlock_lock;
#endif
int prio;
+ u64 deadline;
};
/*
Patches currently in stable-queue which might be from peterz(a)infradead.org are
queue-4.9/sched-stop-resched_cpu-from-sending-ipis-to-offline-cpus.patch
queue-4.9/perf-buildid-do-not-assume-that-readlink-returns-a-null-terminated-string.patch
queue-4.9/printk-correctly-handle-preemption-in-console_unlock.patch
queue-4.9/sched-stop-switched_to_rt-from-sending-ipis-to-offline-cpus.patch
queue-4.9/perf-session-don-t-rely-on-evlist-in-pipe-mode.patch
queue-4.9/rtmutex-fix-pi-chain-order-integrity.patch
queue-4.9/perf-tools-make-perf_event__synthesize_mmap_events-scale.patch
queue-4.9/perf-annotate-fix-a-bug-following-symbolic-link-of-a-build-id-file.patch
queue-4.9/mm-fix-false-positive-vm_bug_on-in-page_cache_-get-add-_speculative.patch
queue-4.9/kprobes-x86-set-kprobes-pages-read-only.patch
queue-4.9/perf-stat-issue-a-hw-watchdog-disable-hint.patch
queue-4.9/kprobes-x86-fix-kprobe-booster-not-to-boost-far-call-instructions.patch
queue-4.9/perf-sort-fix-segfault-with-basic-block-cycles-sort-dimension.patch
queue-4.9/perf-probe-fix-concat_probe_trace_events.patch
queue-4.9/x86-boot-32-defer-resyncing-initial_page_table-until-per-cpu-is-set-up.patch
queue-4.9/perf-inject-copy-events-when-reordering-events-in-pipe-mode.patch
queue-4.9/perf-probe-return-errno-when-not-hitting-any-event.patch
queue-4.9/perf-evsel-return-exact-sub-event-which-failed-with-eperm-for-wildcards.patch
queue-4.9/perf-stat-fix-bug-in-handling-events-in-error-state.patch
This is a note to let you know that I've just added the patch titled
reiserfs: Make cancel_old_flush() reliable
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
reiserfs-make-cancel_old_flush-reliable.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Jan Kara <jack(a)suse.cz>
Date: Wed, 5 Apr 2017 14:09:48 +0200
Subject: reiserfs: Make cancel_old_flush() reliable
From: Jan Kara <jack(a)suse.cz>
[ Upstream commit 71b0576bdb862e964a82c73327cdd1a249c53e67 ]
Currently canceling of delayed work that flushes old data using
cancel_old_flush() does not prevent work from being requeued. Thus
in theory new work can be queued after cancel_old_flush() from
reiserfs_freeze() has run. This will become larger problem once
flush_old_commits() can requeue the work itself.
Fix the problem by recording in sbi->work_queue that flushing work is
canceled and should not be requeued.
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/reiserfs/journal.c | 2 +-
fs/reiserfs/reiserfs.h | 1 +
fs/reiserfs/super.c | 21 +++++++++++++++------
3 files changed, 17 insertions(+), 7 deletions(-)
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -1959,7 +1959,7 @@ static int do_journal_release(struct rei
* will be requeued because superblock is being shutdown and doesn't
* have MS_ACTIVE set.
*/
- cancel_delayed_work_sync(&REISERFS_SB(sb)->old_work);
+ reiserfs_cancel_old_flush(sb);
/* wait for all commits to finish */
cancel_delayed_work_sync(&SB_JOURNAL(sb)->j_work);
--- a/fs/reiserfs/reiserfs.h
+++ b/fs/reiserfs/reiserfs.h
@@ -2948,6 +2948,7 @@ int reiserfs_allocate_list_bitmaps(struc
struct reiserfs_list_bitmap *, unsigned int);
void reiserfs_schedule_old_flush(struct super_block *s);
+void reiserfs_cancel_old_flush(struct super_block *s);
void add_save_link(struct reiserfs_transaction_handle *th,
struct inode *inode, int truncate);
int remove_save_link(struct inode *inode, int truncate);
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -90,7 +90,9 @@ static void flush_old_commits(struct wor
s = sbi->s_journal->j_work_sb;
spin_lock(&sbi->old_work_lock);
- sbi->work_queued = 0;
+ /* Avoid clobbering the cancel state... */
+ if (sbi->work_queued == 1)
+ sbi->work_queued = 0;
spin_unlock(&sbi->old_work_lock);
reiserfs_sync_fs(s, 1);
@@ -117,21 +119,22 @@ void reiserfs_schedule_old_flush(struct
spin_unlock(&sbi->old_work_lock);
}
-static void cancel_old_flush(struct super_block *s)
+void reiserfs_cancel_old_flush(struct super_block *s)
{
struct reiserfs_sb_info *sbi = REISERFS_SB(s);
- cancel_delayed_work_sync(&REISERFS_SB(s)->old_work);
spin_lock(&sbi->old_work_lock);
- sbi->work_queued = 0;
+ /* Make sure no new flushes will be queued */
+ sbi->work_queued = 2;
spin_unlock(&sbi->old_work_lock);
+ cancel_delayed_work_sync(&REISERFS_SB(s)->old_work);
}
static int reiserfs_freeze(struct super_block *s)
{
struct reiserfs_transaction_handle th;
- cancel_old_flush(s);
+ reiserfs_cancel_old_flush(s);
reiserfs_write_lock(s);
if (!(s->s_flags & MS_RDONLY)) {
@@ -152,7 +155,13 @@ static int reiserfs_freeze(struct super_
static int reiserfs_unfreeze(struct super_block *s)
{
+ struct reiserfs_sb_info *sbi = REISERFS_SB(s);
+
reiserfs_allow_writes(s);
+ spin_lock(&sbi->old_work_lock);
+ /* Allow old_work to run again */
+ sbi->work_queued = 0;
+ spin_unlock(&sbi->old_work_lock);
return 0;
}
@@ -2194,7 +2203,7 @@ error_unlocked:
if (sbi->commit_wq)
destroy_workqueue(sbi->commit_wq);
- cancel_delayed_work_sync(&REISERFS_SB(s)->old_work);
+ reiserfs_cancel_old_flush(s);
reiserfs_free_bitmap_cache(s);
if (SB_BUFFER_WITH_SB(s))
Patches currently in stable-queue which might be from jack(a)suse.cz are
queue-4.9/reiserfs-make-cancel_old_flush-reliable.patch
This is a note to let you know that I've just added the patch titled
regulator: core: Limit propagation of parent voltage count and list
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
regulator-core-limit-propagation-of-parent-voltage-count-and-list.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Matthias Kaehlcke <mka(a)chromium.org>
Date: Mon, 27 Mar 2017 16:54:12 -0700
Subject: regulator: core: Limit propagation of parent voltage count and list
From: Matthias Kaehlcke <mka(a)chromium.org>
[ Upstream commit fd086045559d90cd7854818b4c60a7119eda6231 ]
Commit 26988efe11b1 ("regulator: core: Allow to get voltage count and
list from parent") introduces the propagation of the parent voltage
count and list for regulators that don't provide this information
themselves. The goal is to support simple switch regulators, however as
a side effect normal continuous regulators can leak details of their
supplies and provide consumers with inconsistent information.
Limit the propagation of the voltage count and list to switch
regulators.
Fixes: 26988efe11b1 ("regulator: core: Allow to get voltage count and
list from parent")
Signed-off-by: Matthias Kaehlcke <mka(a)chromium.org>
Reviewed-by: Javier Martinez Canillas <javier(a)osg.samsung.com>
Tested-by: Javier Martinez Canillas <javier(a)osg.samsung.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/regulator/core.c | 9 +++++++--
include/linux/regulator/driver.h | 2 ++
2 files changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -2465,7 +2465,7 @@ static int _regulator_list_voltage(struc
ret = ops->list_voltage(rdev, selector);
if (lock)
mutex_unlock(&rdev->mutex);
- } else if (rdev->supply) {
+ } else if (rdev->is_switch && rdev->supply) {
ret = _regulator_list_voltage(rdev->supply, selector, lock);
} else {
return -EINVAL;
@@ -2523,7 +2523,7 @@ int regulator_count_voltages(struct regu
if (rdev->desc->n_voltages)
return rdev->desc->n_voltages;
- if (!rdev->supply)
+ if (!rdev->is_switch || !rdev->supply)
return -EINVAL;
return regulator_count_voltages(rdev->supply);
@@ -4049,6 +4049,11 @@ regulator_register(const struct regulato
mutex_unlock(®ulator_list_mutex);
}
+ if (!rdev->desc->ops->get_voltage &&
+ !rdev->desc->ops->list_voltage &&
+ !rdev->desc->fixed_uV)
+ rdev->is_switch = true;
+
ret = device_register(&rdev->dev);
if (ret != 0) {
put_device(&rdev->dev);
--- a/include/linux/regulator/driver.h
+++ b/include/linux/regulator/driver.h
@@ -425,6 +425,8 @@ struct regulator_dev {
struct regulator_enable_gpio *ena_pin;
unsigned int ena_gpio_state:1;
+ unsigned int is_switch:1;
+
/* time when this regulator was disabled last time */
unsigned long last_off_jiffy;
};
Patches currently in stable-queue which might be from mka(a)chromium.org are
queue-4.9/regulator-core-limit-propagation-of-parent-voltage-count-and-list.patch
This is a note to let you know that I've just added the patch titled
rcutorture/configinit: Fix build directory error message
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rcutorture-configinit-fix-build-directory-error-message.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: SeongJae Park <sj38.park(a)gmail.com>
Date: Fri, 3 Nov 2017 19:17:20 +0900
Subject: rcutorture/configinit: Fix build directory error message
From: SeongJae Park <sj38.park(a)gmail.com>
[ Upstream commit 2adfa4210f8f35cdfb4e08318cc06b99752964c2 ]
The 'configinit.sh' script checks the format of optional argument for the
build directory, printing an error message if the format is not valid.
However, the error message uses the wrong variable, indicating an empty
string even though the user entered a non-empty (but erroneous) string.
This commit fixes the script to use the correct variable.
Fixes: c87b9c601ac8 ("rcutorture: Add KVM-based test framework")
Signed-off-by: SeongJae Park <sj38.park(a)gmail.com>
Signed-off-by: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/testing/selftests/rcutorture/bin/configinit.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/rcutorture/bin/configinit.sh
+++ b/tools/testing/selftests/rcutorture/bin/configinit.sh
@@ -51,7 +51,7 @@ then
mkdir $builddir
fi
else
- echo Bad build directory: \"$builddir\"
+ echo Bad build directory: \"$buildloc\"
exit 2
fi
fi
Patches currently in stable-queue which might be from sj38.park(a)gmail.com are
queue-4.9/rcutorture-configinit-fix-build-directory-error-message.patch
This is a note to let you know that I've just added the patch titled
qed: Fix TM block ILT allocation
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
qed-fix-tm-block-ilt-allocation.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Michal Kalderon <Michal.Kalderon(a)cavium.com>
Date: Mon, 3 Apr 2017 12:21:10 +0300
Subject: qed: Fix TM block ILT allocation
From: Michal Kalderon <Michal.Kalderon(a)cavium.com>
[ Upstream commit 44531ba45dbf3c23cc7ae0934ec9b33ef340ac56 ]
When configuring the HW timers block we should set the number of CIDs
up until the last CID that require timers, instead of only those CIDs
whose protocol needs timers support.
Today, the protocols that require HW timers' support have their CIDs
before any other protocol, but that would change in future [when we
add iWARP support].
Signed-off-by: Michal Kalderon <Michal.Kalderon(a)cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz(a)cavium.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/qlogic/qed/qed_cxt.c | 32 +++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
--- a/drivers/net/ethernet/qlogic/qed/qed_cxt.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_cxt.c
@@ -271,16 +271,34 @@ struct qed_tm_iids {
u32 per_vf_tids;
};
-static void qed_cxt_tm_iids(struct qed_cxt_mngr *p_mngr,
+static void qed_cxt_tm_iids(struct qed_hwfn *p_hwfn,
+ struct qed_cxt_mngr *p_mngr,
struct qed_tm_iids *iids)
{
- u32 i, j;
-
- for (i = 0; i < MAX_CONN_TYPES; i++) {
+ bool tm_vf_required = false;
+ bool tm_required = false;
+ int i, j;
+
+ /* Timers is a special case -> we don't count how many cids require
+ * timers but what's the max cid that will be used by the timer block.
+ * therefore we traverse in reverse order, and once we hit a protocol
+ * that requires the timers memory, we'll sum all the protocols up
+ * to that one.
+ */
+ for (i = MAX_CONN_TYPES - 1; i >= 0; i--) {
struct qed_conn_type_cfg *p_cfg = &p_mngr->conn_cfg[i];
- if (tm_cid_proto(i)) {
+ if (tm_cid_proto(i) || tm_required) {
+ if (p_cfg->cid_count)
+ tm_required = true;
+
iids->pf_cids += p_cfg->cid_count;
+ }
+
+ if (tm_cid_proto(i) || tm_vf_required) {
+ if (p_cfg->cids_per_vf)
+ tm_vf_required = true;
+
iids->per_vf_cids += p_cfg->cids_per_vf;
}
}
@@ -696,7 +714,7 @@ int qed_cxt_cfg_ilt_compute(struct qed_h
/* TM PF */
p_cli = &p_mngr->clients[ILT_CLI_TM];
- qed_cxt_tm_iids(p_mngr, &tm_iids);
+ qed_cxt_tm_iids(p_hwfn, p_mngr, &tm_iids);
total = tm_iids.pf_cids + tm_iids.pf_tids_total;
if (total) {
p_blk = &p_cli->pf_blks[0];
@@ -1591,7 +1609,7 @@ static void qed_tm_init_pf(struct qed_hw
u8 i;
memset(&tm_iids, 0, sizeof(tm_iids));
- qed_cxt_tm_iids(p_mngr, &tm_iids);
+ qed_cxt_tm_iids(p_hwfn, p_mngr, &tm_iids);
/* @@@TBD No pre-scan for now */
Patches currently in stable-queue which might be from Michal.Kalderon(a)cavium.com are
queue-4.9/qed-fix-tm-block-ilt-allocation.patch
This is a note to let you know that I've just added the patch titled
qed: Correct MSI-x for storage
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
qed-correct-msi-x-for-storage.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: "Mintz, Yuval" <Yuval.Mintz(a)cavium.com>
Date: Wed, 5 Apr 2017 21:20:11 +0300
Subject: qed: Correct MSI-x for storage
From: "Mintz, Yuval" <Yuval.Mintz(a)cavium.com>
[ Upstream commit 2f78227874754b1e10cd348fd6e7693b0dabb3f6 ]
When qedr is enabled, qed would try dividing the msi-x vectors between
L2 and RoCE, starting with L2 and providing it with sufficient vectors
for its queues.
Problem is qed would also do that for storage partitions, and as those
don't need queues it would lead qed to award those partitions with 0
msi-x vectors, causing them to believe theye're using INTa and
preventing them from operating.
Fixes: 51ff17251c9c ("qed: Add support for RoCE hw init")
Signed-off-by: Yuval Mintz <Yuval.Mintz(a)cavium.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/qlogic/qed/qed_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -711,7 +711,8 @@ static int qed_slowpath_setup_int(struct
cdev->int_params.fp_msix_cnt = cdev->int_params.out.num_vectors -
cdev->num_hwfns;
- if (!IS_ENABLED(CONFIG_QED_RDMA))
+ if (!IS_ENABLED(CONFIG_QED_RDMA) ||
+ QED_LEADING_HWFN(cdev)->hw_info.personality != QED_PCI_ETH_ROCE)
return 0;
for_each_hwfn(cdev, i)
Patches currently in stable-queue which might be from Yuval.Mintz(a)cavium.com are
queue-4.9/qed-always-publish-vf-link-from-leading-hwfn.patch
queue-4.9/qed-fix-tm-block-ilt-allocation.patch
queue-4.9/qed-correct-msi-x-for-storage.patch
This is a note to let you know that I've just added the patch titled
qed: Always publish VF link from leading hwfn
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
qed-always-publish-vf-link-from-leading-hwfn.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: "Mintz, Yuval" <Yuval.Mintz(a)cavium.com>
Date: Sun, 19 Mar 2017 13:08:20 +0200
Subject: qed: Always publish VF link from leading hwfn
From: "Mintz, Yuval" <Yuval.Mintz(a)cavium.com>
[ Upstream commit e50728effe1126eae39445ba144078b1305b7047 ]
The link information exists only on the leading hwfn,
but some of its derivatives [e.g., min/max rate] need to
be configured for each hwfn.
When re-basing the VF link view, use the leading hwfn
information as basis for all existing hwfns to allow
said configurations to stick.
Signed-off-by: Yuval Mintz <Yuval.Mintz(a)cavium.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/qlogic/qed/qed_sriov.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/qlogic/qed/qed_sriov.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_sriov.c
@@ -3573,6 +3573,7 @@ static int qed_get_vf_config(struct qed_
void qed_inform_vf_link_state(struct qed_hwfn *hwfn)
{
+ struct qed_hwfn *lead_hwfn = QED_LEADING_HWFN(hwfn->cdev);
struct qed_mcp_link_capabilities caps;
struct qed_mcp_link_params params;
struct qed_mcp_link_state link;
@@ -3589,9 +3590,15 @@ void qed_inform_vf_link_state(struct qed
if (!vf_info)
continue;
- memcpy(¶ms, qed_mcp_get_link_params(hwfn), sizeof(params));
- memcpy(&link, qed_mcp_get_link_state(hwfn), sizeof(link));
- memcpy(&caps, qed_mcp_get_link_capabilities(hwfn),
+ /* Only hwfn0 is actually interested in the link speed.
+ * But since only it would receive an MFW indication of link,
+ * need to take configuration from it - otherwise things like
+ * rate limiting for hwfn1 VF would not work.
+ */
+ memcpy(¶ms, qed_mcp_get_link_params(lead_hwfn),
+ sizeof(params));
+ memcpy(&link, qed_mcp_get_link_state(lead_hwfn), sizeof(link));
+ memcpy(&caps, qed_mcp_get_link_capabilities(lead_hwfn),
sizeof(caps));
/* Modify link according to the VF's configured link state */
Patches currently in stable-queue which might be from Yuval.Mintz(a)cavium.com are
queue-4.9/qed-always-publish-vf-link-from-leading-hwfn.patch
queue-4.9/qed-fix-tm-block-ilt-allocation.patch
queue-4.9/qed-correct-msi-x-for-storage.patch
This is a note to let you know that I've just added the patch titled
pwm: tegra: Increase precision in PWM rate calculation
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
pwm-tegra-increase-precision-in-pwm-rate-calculation.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Laxman Dewangan <ldewangan(a)nvidia.com>
Date: Fri, 7 Apr 2017 15:04:00 +0530
Subject: pwm: tegra: Increase precision in PWM rate calculation
From: Laxman Dewangan <ldewangan(a)nvidia.com>
[ Upstream commit 250b76f43f57d578ebff5e7211eb2c73aa5cd6ca ]
The rate of the PWM calculated as follows:
hz = NSEC_PER_SEC / period_ns;
rate = (rate + (hz / 2)) / hz;
This has the precision loss in lower PWM rate.
Change this to have more precision as:
hz = DIV_ROUND_CLOSEST_ULL(NSEC_PER_SEC * 100, period_ns);
rate = DIV_ROUND_CLOSEST(rate * 100, hz)
Example:
1. period_ns = 16672000, PWM clock rate is 200 KHz.
Based on old formula
hz = NSEC_PER_SEC / period_ns
= 1000000000ul/16672000
= 59 (59.98)
rate = (200K + 59/2)/59 = 3390
Based on new method:
hz = 5998
rate = DIV_ROUND_CLOSE(200000*100, 5998) = 3334
If we measure the PWM signal rate, we will get more accurate
period with rate value of 3334 instead of 3390.
2. period_ns = 16803898, PWM clock rate is 200 KHz.
Based on old formula:
hz = 59, rate = 3390
Based on new formula:
hz = 5951, rate = 3360
The PWM signal rate of 3360 is more near to requested period
than 3333.
Signed-off-by: Laxman Dewangan <ldewangan(a)nvidia.com>
Signed-off-by: Thierry Reding <thierry.reding(a)gmail.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/pwm/pwm-tegra.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/pwm/pwm-tegra.c
+++ b/drivers/pwm/pwm-tegra.c
@@ -76,6 +76,7 @@ static int tegra_pwm_config(struct pwm_c
struct tegra_pwm_chip *pc = to_tegra_pwm_chip(chip);
unsigned long long c = duty_ns;
unsigned long rate, hz;
+ unsigned long long ns100 = NSEC_PER_SEC;
u32 val = 0;
int err;
@@ -95,9 +96,11 @@ static int tegra_pwm_config(struct pwm_c
* cycles at the PWM clock rate will take period_ns nanoseconds.
*/
rate = clk_get_rate(pc->clk) >> PWM_DUTY_WIDTH;
- hz = NSEC_PER_SEC / period_ns;
- rate = (rate + (hz / 2)) / hz;
+ /* Consider precision in PWM_SCALE_WIDTH rate calculation */
+ ns100 *= 100;
+ hz = DIV_ROUND_CLOSEST_ULL(ns100, period_ns);
+ rate = DIV_ROUND_CLOSEST(rate * 100, hz);
/*
* Since the actual PWM divider is the register's frequency divider
Patches currently in stable-queue which might be from ldewangan(a)nvidia.com are
queue-4.9/pwm-tegra-increase-precision-in-pwm-rate-calculation.patch
This is a note to let you know that I've just added the patch titled
pwm: stmpe: Fix wrong register offset for hwpwm=2 case
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
pwm-stmpe-fix-wrong-register-offset-for-hwpwm-2-case.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Axel Lin <axel.lin(a)ingics.com>
Date: Tue, 7 Nov 2017 13:18:53 +0800
Subject: pwm: stmpe: Fix wrong register offset for hwpwm=2 case
From: Axel Lin <axel.lin(a)ingics.com>
[ Upstream commit 8472b529e113e0863ea064fdee51bf73c3f86fd6 ]
Fix trivial copy/paste bug.
Signed-off-by: Axel Lin <axel.lin(a)ingics.com>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Fixes: ef1f09eca74a ("pwm: Add a driver for the STMPE PWM")
Signed-off-by: Thierry Reding <thierry.reding(a)gmail.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/pwm/pwm-stmpe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pwm/pwm-stmpe.c
+++ b/drivers/pwm/pwm-stmpe.c
@@ -145,7 +145,7 @@ static int stmpe_24xx_pwm_config(struct
break;
case 2:
- offset = STMPE24XX_PWMIC1;
+ offset = STMPE24XX_PWMIC2;
break;
default:
Patches currently in stable-queue which might be from axel.lin(a)ingics.com are
queue-4.9/pwm-stmpe-fix-wrong-register-offset-for-hwpwm-2-case.patch
This is a note to let you know that I've just added the patch titled
printk: Correctly handle preemption in console_unlock()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
printk-correctly-handle-preemption-in-console_unlock.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Petr Mladek <pmladek(a)suse.com>
Date: Fri, 24 Mar 2017 17:14:05 +0100
Subject: printk: Correctly handle preemption in console_unlock()
From: Petr Mladek <pmladek(a)suse.com>
[ Upstream commit 257ab443118bffc7fdcef38f49cf59be68a3e362 ]
Some console drivers code calls console_conditional_schedule()
that looks at @console_may_schedule. The value must be cleared
when the drivers are called from console_unlock() with
interrupts disabled. But rescheduling is fine when the same
code is called, for example, from tty operations where the
console semaphore is taken via console_lock().
This is why @console_may_schedule is cleared before calling console
drivers. The original value is stored to decide if we could sleep
between lines.
Now, @console_may_schedule is not cleared when we call
console_trylock() and jump back to the "again" goto label.
This has become a problem, since the commit 6b97a20d3a7909daa066
("printk: set may_schedule for some of console_trylock() callers").
@console_may_schedule might get enabled now.
There is also the opposite problem. console_lock() can be called
only from preemptive context. It can always enable scheduling in
the console code. But console_trylock() is not able to detect it
when CONFIG_PREEMPT_COUNT is disabled. Therefore we should use the
original @console_may_schedule value after re-acquiring
the console semaphore in console_unlock().
This patch solves both problems by moving the "again" goto label.
Alternative solution was to clear and restore the value around
call_console_drivers(). Then console_conditional_schedule() could
be used also inside console_unlock(). But there was a potential race
with console_flush_on_panic() as reported by Sergey Senozhatsky.
That function should be called only where there is only one CPU
and with interrupts disabled. But better be on the safe side
because stopping CPUs might fail.
Fixes: 6b97a20d3a7909 ("printk: set may_schedule for some of console_trylock() callers")
Link: http://lkml.kernel.org/r/1490372045-22288-1-git-send-email-pmladek@suse.com
Suggested-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Jiri Slaby <jslaby(a)suse.cz>
Cc: linux-fbdev(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
Signed-off-by: Petr Mladek <pmladek(a)suse.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/printk/printk.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2342,7 +2342,7 @@ void console_unlock(void)
}
/*
- * Console drivers are called under logbuf_lock, so
+ * Console drivers are called with interrupts disabled, so
* @console_may_schedule should be cleared before; however, we may
* end up dumping a lot of lines, for example, if called from
* console registration path, and should invoke cond_resched()
@@ -2350,11 +2350,15 @@ void console_unlock(void)
* scheduling stall on a slow console leading to RCU stall and
* softlockup warnings which exacerbate the issue with more
* messages practically incapacitating the system.
+ *
+ * console_trylock() is not able to detect the preemptive
+ * context reliably. Therefore the value must be stored before
+ * and cleared after the the "again" goto label.
*/
do_cond_resched = console_may_schedule;
+again:
console_may_schedule = 0;
-again:
/*
* We released the console_sem lock, so we need to recheck if
* cpu is online and (if not) is there at least one CON_ANYTIME
Patches currently in stable-queue which might be from pmladek(a)suse.com are
queue-4.9/printk-correctly-handle-preemption-in-console_unlock.patch
queue-4.9/braille-console-fix-value-returned-by-_braille_console_setup.patch
This is a note to let you know that I've just added the patch titled
powerpc/nohash: Fix use of mmu_has_feature() in setup_initial_memory_limit()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-nohash-fix-use-of-mmu_has_feature-in-setup_initial_memory_limit.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Michael Ellerman <mpe(a)ellerman.id.au>
Date: Mon, 3 Apr 2017 12:05:55 +1000
Subject: powerpc/nohash: Fix use of mmu_has_feature() in setup_initial_memory_limit()
From: Michael Ellerman <mpe(a)ellerman.id.au>
[ Upstream commit 4868e3508d1934d28961f940ed6b9f1e347ab52c ]
setup_initial_memory_limit() is called from early_init_devtree(), which
runs prior to feature patching. If the kernel is built with CONFIG_JUMP_LABEL=y
and CONFIG_JUMP_LABEL_FEATURE_CHECKS=y then we will potentially get the
wrong value.
If we also have CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG=y we get a warning
and backtrace:
Warning! mmu_has_feature() used prior to jump label init!
CPU: 0 PID: 0 Comm: swapper Not tainted 4.11.0-rc4-gccN-next-20170331-g6af2434 #1
Call Trace:
[c000000000fc3d50] [c000000000a26c30] .dump_stack+0xa8/0xe8 (unreliable)
[c000000000fc3de0] [c00000000002e6b8] .setup_initial_memory_limit+0xa4/0x104
[c000000000fc3e60] [c000000000d5c23c] .early_init_devtree+0xd0/0x2f8
[c000000000fc3f00] [c000000000d5d3b0] .early_setup+0x90/0x11c
[c000000000fc3f90] [c000000000000520] start_here_multiplatform+0x68/0x80
Fix it by using early_mmu_has_feature().
Fixes: c12e6f24d413 ("powerpc: Add option to use jump label for mmu_has_feature()")
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/mm/tlb_nohash.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/mm/tlb_nohash.c
+++ b/arch/powerpc/mm/tlb_nohash.c
@@ -751,7 +751,7 @@ void setup_initial_memory_limit(phys_add
* avoid going over total available memory just in case...
*/
#ifdef CONFIG_PPC_FSL_BOOK3E
- if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) {
+ if (early_mmu_has_feature(MMU_FTR_TYPE_FSL_E)) {
unsigned long linear_sz;
unsigned int num_cams;
Patches currently in stable-queue which might be from mpe(a)ellerman.id.au are
queue-4.9/powerpc-nohash-fix-use-of-mmu_has_feature-in-setup_initial_memory_limit.patch
queue-4.9/powerpc-avoid-taking-a-data-miss-on-every-userspace-instruction-miss.patch
queue-4.9/powerpc-modules-don-t-try-to-restore-r2-after-a-sibling-call.patch
queue-4.9/powerpc-xmon-fix-an-unexpected-xmon-on-off-state-change.patch
queue-4.9/powerpc-mm-hugetlb-filter-out-hugepage-size-not-supported-by-page-table-layout.patch
This is a note to let you know that I've just added the patch titled
powerpc/modules: Don't try to restore r2 after a sibling call
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-modules-don-t-try-to-restore-r2-after-a-sibling-call.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Josh Poimboeuf <jpoimboe(a)redhat.com>
Date: Thu, 16 Nov 2017 11:45:37 -0600
Subject: powerpc/modules: Don't try to restore r2 after a sibling call
From: Josh Poimboeuf <jpoimboe(a)redhat.com>
[ Upstream commit b9eab08d012fa093947b230f9a87257c27fb829b ]
When attempting to load a livepatch module, I got the following error:
module_64: patch_module: Expect noop after relocate, got 3c820000
The error was triggered by the following code in
unregister_netdevice_queue():
14c: 00 00 00 48 b 14c <unregister_netdevice_queue+0x14c>
14c: R_PPC64_REL24 net_set_todo
150: 00 00 82 3c addis r4,r2,0
GCC didn't insert a nop after the branch to net_set_todo() because it's
a sibling call, so it never returns. The nop isn't needed after the
branch in that case.
Signed-off-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Acked-by: Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
Reviewed-and-tested-by: Kamalesh Babulal <kamalesh(a)linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/include/asm/code-patching.h | 1 +
arch/powerpc/kernel/module_64.c | 12 +++++++++++-
arch/powerpc/lib/code-patching.c | 5 +++++
3 files changed, 17 insertions(+), 1 deletion(-)
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -30,6 +30,7 @@ int patch_branch(unsigned int *addr, uns
int patch_instruction(unsigned int *addr, unsigned int instr);
int instr_is_relative_branch(unsigned int instr);
+int instr_is_relative_link_branch(unsigned int instr);
int instr_is_branch_to_addr(const unsigned int *instr, unsigned long addr);
unsigned long branch_target(const unsigned int *instr);
unsigned int translate_branch(const unsigned int *dest,
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -494,7 +494,17 @@ static bool is_early_mcount_callsite(u32
restore r2. */
static int restore_r2(u32 *instruction, struct module *me)
{
- if (is_early_mcount_callsite(instruction - 1))
+ u32 *prev_insn = instruction - 1;
+
+ if (is_early_mcount_callsite(prev_insn))
+ return 1;
+
+ /*
+ * Make sure the branch isn't a sibling call. Sibling calls aren't
+ * "link" branches and they don't return, so they don't need the r2
+ * restore afterwards.
+ */
+ if (!instr_is_relative_link_branch(*prev_insn))
return 1;
if (*instruction != PPC_INST_NOP) {
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -95,6 +95,11 @@ int instr_is_relative_branch(unsigned in
return instr_is_branch_iform(instr) || instr_is_branch_bform(instr);
}
+int instr_is_relative_link_branch(unsigned int instr)
+{
+ return instr_is_relative_branch(instr) && (instr & BRANCH_SET_LINK);
+}
+
static unsigned long branch_iform_target(const unsigned int *instr)
{
signed long imm;
Patches currently in stable-queue which might be from jpoimboe(a)redhat.com are
queue-4.9/powerpc-modules-don-t-try-to-restore-r2-after-a-sibling-call.patch
queue-4.9/kprobes-x86-set-kprobes-pages-read-only.patch
queue-4.9/kprobes-x86-fix-kprobe-booster-not-to-boost-far-call-instructions.patch
queue-4.9/x86-boot-32-defer-resyncing-initial_page_table-until-per-cpu-is-set-up.patch
This is a note to let you know that I've just added the patch titled
powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-mm-hugetlb-filter-out-hugepage-size-not-supported-by-page-table-layout.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: "Aneesh Kumar K.V" <aneesh.kumar(a)linux.vnet.ibm.com>
Date: Tue, 21 Mar 2017 22:59:56 +0530
Subject: powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout
From: "Aneesh Kumar K.V" <aneesh.kumar(a)linux.vnet.ibm.com>
[ Upstream commit a525108cf1cc14651602d678da38fa627a76a724 ]
Without this if firmware reports 1MB page size support we will crash
trying to use 1MB as hugetlb page size.
echo 300 > /sys/kernel/mm/hugepages/hugepages-1024kB/nr_hugepages
kernel BUG at ./arch/powerpc/include/asm/hugetlb.h:19!
.....
....
[c0000000e2c27b30] c00000000029dae8 .hugetlb_fault+0x638/0xda0
[c0000000e2c27c30] c00000000026fb64 .handle_mm_fault+0x844/0x1d70
[c0000000e2c27d70] c00000000004805c .do_page_fault+0x3dc/0x7c0
[c0000000e2c27e30] c00000000000ac98 handle_page_fault+0x10/0x30
With fix, we don't enable 1MB as hugepage size.
bash-4.2# cd /sys/kernel/mm/hugepages/
bash-4.2# ls
hugepages-16384kB hugepages-16777216kB
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/mm/hugetlbpage.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -765,6 +765,24 @@ static int __init add_huge_page_size(uns
if ((mmu_psize = shift_to_mmu_psize(shift)) < 0)
return -EINVAL;
+#ifdef CONFIG_PPC_BOOK3S_64
+ /*
+ * We need to make sure that for different page sizes reported by
+ * firmware we only add hugetlb support for page sizes that can be
+ * supported by linux page table layout.
+ * For now we have
+ * Radix: 2M
+ * Hash: 16M and 16G
+ */
+ if (radix_enabled()) {
+ if (mmu_psize != MMU_PAGE_2M)
+ return -EINVAL;
+ } else {
+ if (mmu_psize != MMU_PAGE_16M && mmu_psize != MMU_PAGE_16G)
+ return -EINVAL;
+ }
+#endif
+
BUG_ON(mmu_psize_defs[mmu_psize].shift != shift);
/* Return if huge page size has already been setup */
Patches currently in stable-queue which might be from aneesh.kumar(a)linux.vnet.ibm.com are
queue-4.9/mm-fix-false-positive-vm_bug_on-in-page_cache_-get-add-_speculative.patch
queue-4.9/powerpc-mm-hugetlb-filter-out-hugepage-size-not-supported-by-page-table-layout.patch
This is a note to let you know that I've just added the patch titled
powerpc: Avoid taking a data miss on every userspace instruction miss
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-avoid-taking-a-data-miss-on-every-userspace-instruction-miss.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Anton Blanchard <anton(a)samba.org>
Date: Mon, 3 Apr 2017 16:41:02 +1000
Subject: powerpc: Avoid taking a data miss on every userspace instruction miss
From: Anton Blanchard <anton(a)samba.org>
[ Upstream commit a7a9dcd882a67b68568868b988289fce5ffd8419 ]
Early on in do_page_fault() we call store_updates_sp(), regardless of
the type of exception. For an instruction miss this doesn't make
sense, because we only use this information to detect if a data miss
is the result of a stack expansion instruction or not.
Worse still, it results in a data miss within every userspace
instruction miss handler, because we try and load the very instruction
we are about to install a pte for!
A simple exec microbenchmark runs 6% faster on POWER8 with this fix:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
unsigned long left = atol(argv[1]);
char leftstr[16];
if (left-- == 0)
return 0;
sprintf(leftstr, "%ld", left);
execlp(argv[0], argv[0], leftstr, NULL);
perror("exec failed\n");
return 0;
}
Pass the number of iterations on the command line (eg 10000) and time
how long it takes to execute.
Signed-off-by: Anton Blanchard <anton(a)samba.org>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/mm/fault.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -294,7 +294,7 @@ int do_page_fault(struct pt_regs *regs,
* can result in fault, which will cause a deadlock when called with
* mmap_sem held
*/
- if (user_mode(regs))
+ if (!is_exec && user_mode(regs))
store_update_sp = store_updates_sp(regs);
if (user_mode(regs))
Patches currently in stable-queue which might be from anton(a)samba.org are
queue-4.9/powerpc-avoid-taking-a-data-miss-on-every-userspace-instruction-miss.patch
This is a note to let you know that I've just added the patch titled
power: supply: ab8500_charger: Fix an error handling path
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
power-supply-ab8500_charger-fix-an-error-handling-path.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Date: Wed, 22 Nov 2017 21:27:31 +0100
Subject: power: supply: ab8500_charger: Fix an error handling path
From: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
[ Upstream commit bf59fddde1c3eab89eb8dca8f3d3dc097887d2bb ]
'ret' is know to be 0 at this point, because it has not been updated by the
the previous call to 'abx500_mask_and_set_register_interruptible()'.
Fix it by updating 'ret' before checking if an error occurred.
Fixes: 84edbeeab67c ("ab8500-charger: AB8500 charger driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Signed-off-by: Sebastian Reichel <sebastian.reichel(a)collabora.co.uk>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/power/supply/ab8500_charger.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/power/supply/ab8500_charger.c
+++ b/drivers/power/supply/ab8500_charger.c
@@ -3218,7 +3218,7 @@ static int ab8500_charger_init_hw_regist
}
/* Enable backup battery charging */
- abx500_mask_and_set_register_interruptible(di->dev,
+ ret = abx500_mask_and_set_register_interruptible(di->dev,
AB8500_RTC, AB8500_RTC_CTRL_REG,
RTC_BUP_CH_ENA, RTC_BUP_CH_ENA);
if (ret < 0)
Patches currently in stable-queue which might be from christophe.jaillet(a)wanadoo.fr are
queue-4.9/power-supply-ab8500_charger-fix-an-error-handling-path.patch
queue-4.9/power-supply-ab8500_charger-bail-out-in-case-of-error-in-ab8500_charger_init_hw_registers.patch
This is a note to let you know that I've just added the patch titled
power: supply: ab8500_charger: Bail out in case of error in 'ab8500_charger_init_hw_registers()'
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
power-supply-ab8500_charger-bail-out-in-case-of-error-in-ab8500_charger_init_hw_registers.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Date: Wed, 22 Nov 2017 21:31:20 +0100
Subject: power: supply: ab8500_charger: Bail out in case of error in 'ab8500_charger_init_hw_registers()'
From: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
[ Upstream commit 09edcb647542487864e23aa8d2ef26be3e08978a ]
If an error occurs when we enable the backup battery charging, we should
go through the error handling path directly.
Before commit db43e6c473b5 ("ab8500-bm: Add usb power path support") this
was the case, but this commit has added some code between the last test and
the 'out' label.
So, in case of error, this added code is executed and the error may be
silently ignored.
Fix it by adding the missing 'goto out', as done in all other error
handling paths.
Fixes: db43e6c473b5 ("ab8500-bm: Add usb power path support")
Signed-off-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Signed-off-by: Sebastian Reichel <sebastian.reichel(a)collabora.co.uk>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/power/supply/ab8500_charger.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/power/supply/ab8500_charger.c
+++ b/drivers/power/supply/ab8500_charger.c
@@ -3221,8 +3221,10 @@ static int ab8500_charger_init_hw_regist
ret = abx500_mask_and_set_register_interruptible(di->dev,
AB8500_RTC, AB8500_RTC_CTRL_REG,
RTC_BUP_CH_ENA, RTC_BUP_CH_ENA);
- if (ret < 0)
+ if (ret < 0) {
dev_err(di->dev, "%s mask and set failed\n", __func__);
+ goto out;
+ }
if (is_ab8540(di->parent)) {
ret = abx500_mask_and_set_register_interruptible(di->dev,
Patches currently in stable-queue which might be from christophe.jaillet(a)wanadoo.fr are
queue-4.9/power-supply-ab8500_charger-fix-an-error-handling-path.patch
queue-4.9/power-supply-ab8500_charger-bail-out-in-case-of-error-in-ab8500_charger_init_hw_registers.patch
This is a note to let you know that I've just added the patch titled
perf trace: Handle unpaired raw_syscalls:sys_exit event
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-trace-handle-unpaired-raw_syscalls-sys_exit-event.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Date: Wed, 29 Mar 2017 16:37:51 -0300
Subject: perf trace: Handle unpaired raw_syscalls:sys_exit event
From: Arnaldo Carvalho de Melo <acme(a)redhat.com>
[ Upstream commit fd2b2975149f5f7099693027cece81b16842964a ]
Which may happen when we start a tracing session and a thread is waiting
for something like "poll" to return, in which case we better print "?"
both for the syscall entry timestamp and for the duration.
E.g.:
Tracing existing mutt session:
# perf trace -p `pidof mutt`
? ( ? ): mutt/17135 ... [continued]: poll()) = 1
0.027 ( 0.013 ms): mutt/17135 read(buf: 0x7ffcb3c42cef, count: 1) = 1
0.047 ( 0.008 ms): mutt/17135 poll(ufds: 0x7ffcb3c42c50, nfds: 1, timeout_msecs: 1000) = 1
0.059 ( 0.008 ms): mutt/17135 read(buf: 0x7ffcb3c42cef, count: 1) = 1
<SNIP>
Before it would print a large number because we'd do:
ttrace->entry_time - trace->base_time
And entry_time would be 0, while base_time would be the timestamp for
the first event 'perf trace' reads, oops.
Cc: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: David Ahern <dsahern(a)gmail.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Luis Claudio Gonçalves <lclaudio(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Wang Nan <wangnan0(a)huawei.com>
Link: http://lkml.kernel.org/n/tip-wbcb93ofva2qdjd5ltn5eeqq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/builtin-trace.c | 43 ++++++++++++++++++++++++++++++++++---------
1 file changed, 34 insertions(+), 9 deletions(-)
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -822,12 +822,21 @@ struct syscall {
void **arg_parm;
};
-static size_t fprintf_duration(unsigned long t, FILE *fp)
+/*
+ * We need to have this 'calculated' boolean because in some cases we really
+ * don't know what is the duration of a syscall, for instance, when we start
+ * a session and some threads are waiting for a syscall to finish, say 'poll',
+ * in which case all we can do is to print "( ? ) for duration and for the
+ * start timestamp.
+ */
+static size_t fprintf_duration(unsigned long t, bool calculated, FILE *fp)
{
double duration = (double)t / NSEC_PER_MSEC;
size_t printed = fprintf(fp, "(");
- if (duration >= 1.0)
+ if (!calculated)
+ printed += fprintf(fp, " ? ");
+ else if (duration >= 1.0)
printed += color_fprintf(fp, PERF_COLOR_RED, "%6.3f ms", duration);
else if (duration >= 0.01)
printed += color_fprintf(fp, PERF_COLOR_YELLOW, "%6.3f ms", duration);
@@ -1030,13 +1039,27 @@ static bool trace__filter_duration(struc
return t < (trace->duration_filter * NSEC_PER_MSEC);
}
-static size_t trace__fprintf_tstamp(struct trace *trace, u64 tstamp, FILE *fp)
+static size_t __trace__fprintf_tstamp(struct trace *trace, u64 tstamp, FILE *fp)
{
double ts = (double)(tstamp - trace->base_time) / NSEC_PER_MSEC;
return fprintf(fp, "%10.3f ", ts);
}
+/*
+ * We're handling tstamp=0 as an undefined tstamp, i.e. like when we are
+ * using ttrace->entry_time for a thread that receives a sys_exit without
+ * first having received a sys_enter ("poll" issued before tracing session
+ * starts, lost sys_enter exit due to ring buffer overflow).
+ */
+static size_t trace__fprintf_tstamp(struct trace *trace, u64 tstamp, FILE *fp)
+{
+ if (tstamp > 0)
+ return __trace__fprintf_tstamp(trace, tstamp, fp);
+
+ return fprintf(fp, " ? ");
+}
+
static bool done = false;
static bool interrupted = false;
@@ -1047,10 +1070,10 @@ static void sig_handler(int sig)
}
static size_t trace__fprintf_entry_head(struct trace *trace, struct thread *thread,
- u64 duration, u64 tstamp, FILE *fp)
+ u64 duration, bool duration_calculated, u64 tstamp, FILE *fp)
{
size_t printed = trace__fprintf_tstamp(trace, tstamp, fp);
- printed += fprintf_duration(duration, fp);
+ printed += fprintf_duration(duration, duration_calculated, fp);
if (trace->multiple_threads) {
if (trace->show_comm)
@@ -1452,7 +1475,7 @@ static int trace__printf_interrupted_ent
duration = sample->time - ttrace->entry_time;
- printed = trace__fprintf_entry_head(trace, trace->current, duration, ttrace->entry_time, trace->output);
+ printed = trace__fprintf_entry_head(trace, trace->current, duration, true, ttrace->entry_time, trace->output);
printed += fprintf(trace->output, "%-70s) ...\n", ttrace->entry_str);
ttrace->entry_pending = false;
@@ -1499,7 +1522,7 @@ static int trace__sys_enter(struct trace
if (sc->is_exit) {
if (!(trace->duration_filter || trace->summary_only || trace->min_stack)) {
- trace__fprintf_entry_head(trace, thread, 1, ttrace->entry_time, trace->output);
+ trace__fprintf_entry_head(trace, thread, 0, false, ttrace->entry_time, trace->output);
fprintf(trace->output, "%-70s)\n", ttrace->entry_str);
}
} else {
@@ -1547,6 +1570,7 @@ static int trace__sys_exit(struct trace
{
long ret;
u64 duration = 0;
+ bool duration_calculated = false;
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1, callchain_ret = 0;
struct syscall *sc = trace__syscall_info(trace, evsel, id);
@@ -1577,6 +1601,7 @@ static int trace__sys_exit(struct trace
duration = sample->time - ttrace->entry_time;
if (trace__filter_duration(trace, duration))
goto out;
+ duration_calculated = true;
} else if (trace->duration_filter)
goto out;
@@ -1592,7 +1617,7 @@ static int trace__sys_exit(struct trace
if (trace->summary_only)
goto out;
- trace__fprintf_entry_head(trace, thread, duration, ttrace->entry_time, trace->output);
+ trace__fprintf_entry_head(trace, thread, duration, duration_calculated, ttrace->entry_time, trace->output);
if (ttrace->entry_pending) {
fprintf(trace->output, "%-70s", ttrace->entry_str);
@@ -1855,7 +1880,7 @@ static int trace__pgfault(struct trace *
thread__find_addr_location(thread, sample->cpumode, MAP__FUNCTION,
sample->ip, &al);
- trace__fprintf_entry_head(trace, thread, 0, sample->time, trace->output);
+ trace__fprintf_entry_head(trace, thread, 0, true, sample->time, trace->output);
fprintf(trace->output, "%sfault [",
evsel->attr.config == PERF_COUNT_SW_PAGE_FAULTS_MAJ ?
Patches currently in stable-queue which might be from acme(a)redhat.com are
queue-4.9/perf-buildid-do-not-assume-that-readlink-returns-a-null-terminated-string.patch
queue-4.9/perf-session-don-t-rely-on-evlist-in-pipe-mode.patch
queue-4.9/perf-tools-make-perf_event__synthesize_mmap_events-scale.patch
queue-4.9/perf-annotate-fix-a-bug-following-symbolic-link-of-a-build-id-file.patch
queue-4.9/perf-stat-issue-a-hw-watchdog-disable-hint.patch
queue-4.9/perf-sort-fix-segfault-with-basic-block-cycles-sort-dimension.patch
queue-4.9/perf-trace-handle-unpaired-raw_syscalls-sys_exit-event.patch
queue-4.9/perf-probe-fix-concat_probe_trace_events.patch
queue-4.9/perf-inject-copy-events-when-reordering-events-in-pipe-mode.patch
queue-4.9/perf-probe-return-errno-when-not-hitting-any-event.patch
queue-4.9/perf-evsel-return-exact-sub-event-which-failed-with-eperm-for-wildcards.patch
queue-4.9/perf-stat-fix-bug-in-handling-events-in-error-state.patch
This is a note to let you know that I've just added the patch titled
perf tools: Make perf_event__synthesize_mmap_events() scale
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-tools-make-perf_event__synthesize_mmap_events-scale.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Stephane Eranian <eranian(a)google.com>
Date: Wed, 15 Mar 2017 10:17:13 -0700
Subject: perf tools: Make perf_event__synthesize_mmap_events() scale
From: Stephane Eranian <eranian(a)google.com>
[ Upstream commit 88b897a30c525c2eee6e7f16e1e8d0f18830845e ]
This patch significantly improves the execution time of
perf_event__synthesize_mmap_events() when running perf record on systems
where processes have lots of threads.
It just happens that cat /proc/pid/maps support uses a O(N^2) algorithm to
generate each map line in the maps file. If you have 1000 threads, then you
have necessarily 1000 stacks. For each vma, you need to check if it
corresponds to a thread's stack. With a large number of threads, this can take
a very long time. I have seen latencies >> 10mn.
As of today, perf does not use the fact that a mapping is a stack, therefore we
can work around the issue by using /proc/pid/tasks/pid/maps. This entry does
not try to map a vma to stack and is thus much faster with no loss of
functonality.
The proc-map-timeout logic is kept in case users still want some upper limit.
In V2, we fix the file path from /proc/pid/tasks/pid/maps to actual
/proc/pid/task/pid/maps, tasks -> task. Thanks Arnaldo for catching this.
Committer note:
This problem seems to have been elliminated in the kernel since commit :
b18cb64ead40 ("fs/proc: Stop trying to report thread stacks").
Signed-off-by: Stephane Eranian <eranian(a)google.com>
Acked-by: Jiri Olsa <jolsa(a)redhat.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Link: http://lkml.kernel.org/r/20170315135059.GC2177@redhat.com
Link: http://lkml.kernel.org/r/1489598233-25586-1-git-send-email-eranian@google.c…
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/event.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(s
if (machine__is_default_guest(machine))
return 0;
- snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
- machine->root_dir, pid);
+ snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps",
+ machine->root_dir, pid, pid);
fp = fopen(filename, "r");
if (fp == NULL) {
Patches currently in stable-queue which might be from eranian(a)google.com are
queue-4.9/perf-session-don-t-rely-on-evlist-in-pipe-mode.patch
queue-4.9/perf-tools-make-perf_event__synthesize_mmap_events-scale.patch
queue-4.9/perf-inject-copy-events-when-reordering-events-in-pipe-mode.patch
queue-4.9/perf-stat-fix-bug-in-handling-events-in-error-state.patch
This is a note to let you know that I've just added the patch titled
perf stat: Issue a HW watchdog disable hint
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-stat-issue-a-hw-watchdog-disable-hint.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Borislav Petkov <bp(a)suse.de>
Date: Tue, 7 Feb 2017 01:40:05 +0100
Subject: perf stat: Issue a HW watchdog disable hint
From: Borislav Petkov <bp(a)suse.de>
[ Upstream commit 02d492e5dcb72c004d213756eb87c9d62a6d76a7 ]
When using perf stat on an AMD F15h system with the default hw events
attributes, some of the events don't get counted:
Performance counter stats for 'sleep 1':
0.749208 task-clock (msec) # 0.001 CPUs utilized
1 context-switches # 0.001 M/sec
0 cpu-migrations # 0.000 K/sec
54 page-faults # 0.072 M/sec
1,122,815 cycles # 1.499 GHz
286,740 stalled-cycles-frontend # 25.54% frontend cycles idle
<not counted> stalled-cycles-backend (0.00%)
^^^^^^^^^^^^
<not counted> instructions (0.00%)
^^^^^^^^^^^^
<not counted> branches (0.00%)
<not counted> branch-misses (0.00%)
1.001550070 seconds time elapsed
The reason is that we have the HW watchdog consuming one PMU counter and
when perf tries to schedule 6 events on 6 counters and some of those
counters are constrained to only a specific subset of PMCs by the
hardware, the event scheduling fails.
So issue a hint to disable the HW watchdog around a perf stat session.
Committer note:
Testing it...
# perf stat -d usleep 1
Performance counter stats for 'usleep 1':
1.180203 task-clock (msec) # 0.490 CPUs utilized
1 context-switches # 0.847 K/sec
0 cpu-migrations # 0.000 K/sec
54 page-faults # 0.046 M/sec
184,754 cycles # 0.157 GHz
714,553 instructions # 3.87 insn per cycle
154,661 branches # 131.046 M/sec
7,247 branch-misses # 4.69% of all branches
219,984 L1-dcache-loads # 186.395 M/sec
17,600 L1-dcache-load-misses # 8.00% of all L1-dcache hits (90.16%)
<not counted> LLC-loads (0.00%)
<not counted> LLC-load-misses (0.00%)
0.002406823 seconds time elapsed
Some events weren't counted. Try disabling the NMI watchdog:
echo 0 > /proc/sys/kernel/nmi_watchdog
perf stat ...
echo 1 > /proc/sys/kernel/nmi_watchdog
#
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Acked-by: Ingo Molnar <mingo(a)kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Robert Richter <rric(a)kernel.org>
Cc: Vince Weaver <vince(a)deater.net>
Link: http://lkml.kernel.org/r/20170211183218.ijnvb5f7ciyuunx4@pd.tnic
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/builtin-stat.c | 11 +++++++++++
1 file changed, 11 insertions(+)
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -146,6 +146,7 @@ static aggr_get_id_t aggr_get_id;
static bool append_file;
static const char *output_name;
static int output_fd;
+static int print_free_counters_hint;
struct perf_stat {
bool record;
@@ -1109,6 +1110,9 @@ static void printout(int id, int nr, str
counter->supported ? CNTR_NOT_COUNTED : CNTR_NOT_SUPPORTED,
csv_sep);
+ if (counter->supported)
+ print_free_counters_hint = 1;
+
fprintf(stat_config.output, "%-*s%s",
csv_output ? 0 : unit_width,
counter->unit, csv_sep);
@@ -1477,6 +1481,13 @@ static void print_footer(void)
avg_stats(&walltime_nsecs_stats));
}
fprintf(output, "\n\n");
+
+ if (print_free_counters_hint)
+ fprintf(output,
+"Some events weren't counted. Try disabling the NMI watchdog:\n"
+" echo 0 > /proc/sys/kernel/nmi_watchdog\n"
+" perf stat ...\n"
+" echo 1 > /proc/sys/kernel/nmi_watchdog\n");
}
static void print_counters(struct timespec *ts, int argc, const char **argv)
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.9/x86-mce-handle-broadcasted-mce-gracefully-with-kexec.patch
queue-4.9/perf-stat-issue-a-hw-watchdog-disable-hint.patch
queue-4.9/x86-mm-make-mmap-map_32bit-work-correctly.patch
queue-4.9/edac-altera-fix-peripheral-warnings-for-cyclone5.patch
queue-4.9/kvm-svm-setup-mcg_cap-on-amd-properly.patch
queue-4.9/x86-mce-init-some-cpu-features-early.patch
This is a note to let you know that I've just added the patch titled
perf stat: Fix bug in handling events in error state
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-stat-fix-bug-in-handling-events-in-error-state.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Stephane Eranian <eranian(a)google.com>
Date: Wed, 12 Apr 2017 11:23:01 -0700
Subject: perf stat: Fix bug in handling events in error state
From: Stephane Eranian <eranian(a)google.com>
[ Upstream commit db49a71798a38f3ddf3f3462703328dca39b1ac7 ]
(This is a patch has been sitting in the Intel CQM/CMT driver series for
a while, despite not depend on it. Sending it now independently since
the series is being discarded.)
When an event is in error state, read() returns 0 instead of sizeof()
buffer. In certain modes, such as interval printing, ignoring the 0
return value may cause bogus count deltas to be computed and thus
invalid results printed.
This patch fixes this problem by modifying read_counters() to mark the
event as not scaled (scaled = -1) to force the printout routine to show
<NOT COUNTED>.
Signed-off-by: Stephane Eranian <eranian(a)google.com>
Reviewed-by: David Carrillo-Cisneros <davidcc(a)google.com>
Acked-by: Jiri Olsa <jolsa(a)redhat.com>
Cc: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Paul Turner <pjt(a)google.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Wang Nan <wangnan0(a)huawei.com>
Link: http://lkml.kernel.org/r/20170412182301.44406-1-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/builtin-stat.c | 12 +++++++++---
tools/perf/util/evsel.c | 4 ++--
2 files changed, 11 insertions(+), 5 deletions(-)
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -311,8 +311,12 @@ static int read_counter(struct perf_evse
struct perf_counts_values *count;
count = perf_counts(counter->counts, cpu, thread);
- if (perf_evsel__read(counter, cpu, thread, count))
+ if (perf_evsel__read(counter, cpu, thread, count)) {
+ counter->counts->scaled = -1;
+ perf_counts(counter->counts, cpu, thread)->ena = 0;
+ perf_counts(counter->counts, cpu, thread)->run = 0;
return -1;
+ }
if (STAT_RECORD) {
if (perf_evsel__write_stat_event(counter, cpu, thread, count)) {
@@ -337,12 +341,14 @@ static int read_counter(struct perf_evse
static void read_counters(void)
{
struct perf_evsel *counter;
+ int ret;
evlist__for_each_entry(evsel_list, counter) {
- if (read_counter(counter))
+ ret = read_counter(counter);
+ if (ret)
pr_debug("failed to read counter %s\n", counter->name);
- if (perf_stat_process_counter(&stat_config, counter))
+ if (ret == 0 && perf_stat_process_counter(&stat_config, counter))
pr_warning("failed to process counter %s\n", counter->name);
}
}
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1221,7 +1221,7 @@ int perf_evsel__read(struct perf_evsel *
if (FD(evsel, cpu, thread) < 0)
return -EINVAL;
- if (readn(FD(evsel, cpu, thread), count, sizeof(*count)) < 0)
+ if (readn(FD(evsel, cpu, thread), count, sizeof(*count)) <= 0)
return -errno;
return 0;
@@ -1239,7 +1239,7 @@ int __perf_evsel__read_on_cpu(struct per
if (evsel->counts == NULL && perf_evsel__alloc_counts(evsel, cpu + 1, thread + 1) < 0)
return -ENOMEM;
- if (readn(FD(evsel, cpu, thread), &count, nv * sizeof(u64)) < 0)
+ if (readn(FD(evsel, cpu, thread), &count, nv * sizeof(u64)) <= 0)
return -errno;
perf_evsel__compute_deltas(evsel, cpu, thread, &count);
Patches currently in stable-queue which might be from eranian(a)google.com are
queue-4.9/perf-session-don-t-rely-on-evlist-in-pipe-mode.patch
queue-4.9/perf-tools-make-perf_event__synthesize_mmap_events-scale.patch
queue-4.9/perf-inject-copy-events-when-reordering-events-in-pipe-mode.patch
queue-4.9/perf-stat-fix-bug-in-handling-events-in-error-state.patch
This is a note to let you know that I've just added the patch titled
perf sort: Fix segfault with basic block 'cycles' sort dimension
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-sort-fix-segfault-with-basic-block-cycles-sort-dimension.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Changbin Du <changbin.du(a)intel.com>
Date: Mon, 13 Mar 2017 16:31:48 +0800
Subject: perf sort: Fix segfault with basic block 'cycles' sort dimension
From: Changbin Du <changbin.du(a)intel.com>
[ Upstream commit 4b0b3aa6a2756e6115fdf275c521e4552a7082f3 ]
Skip the sample which doesn't have branch_info to avoid segmentation
fault:
The fault can be reproduced by:
perf record -a
perf report -F cycles
Signed-off-by: Changbin Du <changbin.du(a)intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Fixes: 0e332f033a82 ("perf tools: Add support for cycles, weight branch_info field")
Link: http://lkml.kernel.org/r/20170313083148.23568-1-changbin.du@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/sort.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -846,6 +846,9 @@ static int hist_entry__mispredict_snprin
static int64_t
sort__cycles_cmp(struct hist_entry *left, struct hist_entry *right)
{
+ if (!left->branch_info || !right->branch_info)
+ return cmp_null(left->branch_info, right->branch_info);
+
return left->branch_info->flags.cycles -
right->branch_info->flags.cycles;
}
@@ -853,6 +856,8 @@ sort__cycles_cmp(struct hist_entry *left
static int hist_entry__cycles_snprintf(struct hist_entry *he, char *bf,
size_t size, unsigned int width)
{
+ if (!he->branch_info)
+ return scnprintf(bf, size, "%-.*s", width, "N/A");
if (he->branch_info->flags.cycles == 0)
return repsep_snprintf(bf, size, "%-*s", width, "-");
return repsep_snprintf(bf, size, "%-*hd", width,
Patches currently in stable-queue which might be from changbin.du(a)intel.com are
queue-4.9/perf-sort-fix-segfault-with-basic-block-cycles-sort-dimension.patch
This is a note to let you know that I've just added the patch titled
perf session: Don't rely on evlist in pipe mode
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-session-don-t-rely-on-evlist-in-pipe-mode.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: David Carrillo-Cisneros <davidcc(a)google.com>
Date: Mon, 10 Apr 2017 13:14:30 -0700
Subject: perf session: Don't rely on evlist in pipe mode
From: David Carrillo-Cisneros <davidcc(a)google.com>
[ Upstream commit 0973ad97c187e06aece61f685b9c3b2d93290a73 ]
Session sets a number parameters that rely on evlist. These parameters
are not used in pipe-mode and should not be set, since evlist is
unavailable. Fix that.
Signed-off-by: David Carrillo-Cisneros <davidcc(a)google.com>
Acked-by: Jiri Olsa <jolsa(a)kernel.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: He Kuang <hekuang(a)huawei.com>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Paul Turner <pjt(a)google.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Simon Que <sque(a)chromium.org>
Cc: Stephane Eranian <eranian(a)google.com>
Cc: Wang Nan <wangnan0(a)huawei.com>
Link: http://lkml.kernel.org/r/20170410201432.24807-6-davidcc@google.com
[ Check if file != NULL in perf_session__new(), like when used by builtin-top.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/session.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -139,8 +139,14 @@ struct perf_session *perf_session__new(s
if (perf_session__open(session) < 0)
goto out_close;
- perf_session__set_id_hdr_size(session);
- perf_session__set_comm_exec(session);
+ /*
+ * set session attributes that are present in perf.data
+ * but not in pipe-mode.
+ */
+ if (!file->is_pipe) {
+ perf_session__set_id_hdr_size(session);
+ perf_session__set_comm_exec(session);
+ }
}
} else {
session->machines.host.env = &perf_env;
@@ -155,7 +161,11 @@ struct perf_session *perf_session__new(s
pr_warning("Cannot read kernel map\n");
}
- if (tool && tool->ordering_requires_timestamps &&
+ /*
+ * In pipe-mode, evlist is empty until PERF_RECORD_HEADER_ATTR is
+ * processed, so perf_evlist__sample_id_all is not meaningful here.
+ */
+ if ((!file || !file->is_pipe) && tool && tool->ordering_requires_timestamps &&
tool->ordered_events && !perf_evlist__sample_id_all(session->evlist)) {
dump_printf("WARNING: No sample_id_all support, falling back to unordered processing\n");
tool->ordered_events = false;
Patches currently in stable-queue which might be from davidcc(a)google.com are
queue-4.9/perf-session-don-t-rely-on-evlist-in-pipe-mode.patch
queue-4.9/perf-inject-copy-events-when-reordering-events-in-pipe-mode.patch
queue-4.9/perf-stat-fix-bug-in-handling-events-in-error-state.patch
This is a note to let you know that I've just added the patch titled
perf probe: Return errno when not hitting any event
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-probe-return-errno-when-not-hitting-any-event.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Date: Fri, 17 Mar 2017 16:16:32 +0800
Subject: perf probe: Return errno when not hitting any event
From: Kefeng Wang <wangkefeng.wang(a)huawei.com>
[ Upstream commit 70946723eeb859466f026274b29c6196e39149c4 ]
On old perf, when using 'perf probe -d' to delete an inexistent event,
it returns errno, eg,
-bash-4.3# perf probe -d xxx || echo $?
Info: Event "*:xxx" does not exist.
Error: Failed to delete events.
255
But now perf_del_probe_events() will always set ret = 0, different from
previous del_perf_probe_events(). After this, it returns errno again,
eg,
-bash-4.3# ./perf probe -d xxx || echo $?
"xxx" does not hit any event.
Error: Failed to delete events.
254
And it is more appropriate to return -ENOENT instead of -EPERM.
Signed-off-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Hanjun Guo <guohanjun(a)huawei.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Wang Nan <wangnan0(a)huawei.com>
Fixes: dddc7ee32fa1 ("perf probe: Fix an error when deleting probes successfully")
Link: http://lkml.kernel.org/r/1489738592-61011-1-git-send-email-wangkefeng.wang@…
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/builtin-probe.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -442,9 +442,9 @@ static int perf_del_probe_events(struct
}
if (ret == -ENOENT && ret2 == -ENOENT)
- pr_debug("\"%s\" does not hit any event.\n", str);
- /* Note that this is silently ignored */
- ret = 0;
+ pr_warning("\"%s\" does not hit any event.\n", str);
+ else
+ ret = 0;
error:
if (kfd >= 0)
Patches currently in stable-queue which might be from wangkefeng.wang(a)huawei.com are
queue-4.9/perf-probe-return-errno-when-not-hitting-any-event.patch
This is a note to let you know that I've just added the patch titled
perf probe: Fix concat_probe_trace_events
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-probe-fix-concat_probe_trace_events.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Ravi Bangoria <ravi.bangoria(a)linux.vnet.ibm.com>
Date: Wed, 8 Mar 2017 12:29:07 +0530
Subject: perf probe: Fix concat_probe_trace_events
From: Ravi Bangoria <ravi.bangoria(a)linux.vnet.ibm.com>
[ Upstream commit f0a30dca5f84fe8048271799b56677ac2279de66 ]
'*ntevs' contains number of elements present in 'tevs' array. If there
are no elements in array, 'tevs2' can be directly assigned to 'tevs'
without allocating more space. So the condition should be '*ntevs == 0'
not 'ntevs == 0'.
Signed-off-by: Ravi Bangoria <ravi.bangoria(a)linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Fixes: 42bba263eb58 ("perf probe: Allow wildcard for cached events")
Link: http://lkml.kernel.org/r/20170308065908.4128-1-ravi.bangoria@linux.vnet.ibm…
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/probe-event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -3060,7 +3060,7 @@ concat_probe_trace_events(struct probe_t
struct probe_trace_event *new_tevs;
int ret = 0;
- if (ntevs == 0) {
+ if (*ntevs == 0) {
*tevs = *tevs2;
*ntevs = ntevs2;
*tevs2 = NULL;
Patches currently in stable-queue which might be from ravi.bangoria(a)linux.vnet.ibm.com are
queue-4.9/perf-probe-fix-concat_probe_trace_events.patch