Commit 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h"),
claimed L2 misses were unsupported, due to them not being found in its
referenced documentation, whose link has now moved [1].
That old documentation listed PMCx064 unit mask bit 3 as:
"LsRdBlkC: LS Read Block C S L X Change to X Miss."
and bit 0 as:
"IcFillMiss: IC Fill Miss"
We now have new public documentation [2] with improved descriptions, that
clearly indicate what events those unit mask bits represent:
Bit 3 now clearly states:
"LsRdBlkC: Data Cache Req Miss in L2 (all types)"
and bit 0 is:
"IcFillMiss: Instruction Cache Req Miss in L2."
So we can now add support for L2 misses in perf's genericised events as
PMCx064 with both the above unit masks.
[1] The commit's original documentation reference, "Processor Programming
Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors",
originally available here:
https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0…
is now available here:
https://developer.amd.com/wordpress/media/2017/11/54945_PPR_Family_17h_Mode…
[2] "Processor Programming Reference (PPR) for Family 17h Model 31h,
Revision B0 Processors", available here:
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme(a)kernel.org>
Cc: Babu Moger <babu.moger(a)amd.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: Frank van der Linden <fllinden(a)amazon.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Huang Rui <ray.huang(a)amd.com>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan(a)amd.com>
Cc: Jan Beulich <jbeulich(a)suse.com>
Cc: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Luwei Kang <luwei.kang(a)intel.com>
Cc: Martin Liška <mliska(a)suse.cz>
Cc: Matt Fleming <matt(a)codeblueprint.co.uk>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit(a)amd.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: x86(a)kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Reported-by: Babu Moger <babu.moger(a)amd.com>
Tested-by: Babu Moger <babu.moger(a)amd.com>
Fixes: 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h")
Signed-off-by: Kim Phillips <kim.phillips(a)amd.com>
---
RESENDing because I wasn't sure if the original version of this patch would get
ignored because it was sent with "[PATCH internal v2]" in the subject line:
https://lkml.org/lkml/2020/1/8/894
FWIW, I updated the Cc list to merge with those in patch 2/2 of this series.
arch/x86/events/amd/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 1f22b6bbda68..39eb276d0277 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -250,6 +250,7 @@ static const u64 amd_f17h_perfmon_event_map[PERF_COUNT_HW_MAX] =
[PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
[PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
[PERF_COUNT_HW_CACHE_REFERENCES] = 0xff60,
+ [PERF_COUNT_HW_CACHE_MISSES] = 0x0964,
[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
[PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x0287,
--
2.24.1
gcc -O3 warns that some local variables are not properly initialized:
drivers/scsi/fnic/vnic_dev.c: In function 'fnic_dev_hang_notify':
drivers/scsi/fnic/vnic_dev.c:511:16: error: 'a0' is used uninitialized in this function [-Werror=uninitialized]
vdev->args[0] = *a0;
~~~~~~~~~~~~~~^~~~~
drivers/scsi/fnic/vnic_dev.c:691:6: note: 'a0' was declared here
u64 a0, a1;
^~
drivers/scsi/fnic/vnic_dev.c:512:16: error: 'a1' is used uninitialized in this function [-Werror=uninitialized]
vdev->args[1] = *a1;
~~~~~~~~~~~~~~^~~~~
drivers/scsi/fnic/vnic_dev.c:691:10: note: 'a1' was declared here
u64 a0, a1;
^~
drivers/scsi/fnic/vnic_dev.c: In function 'fnic_dev_mac_addr':
drivers/scsi/fnic/vnic_dev.c:512:16: error: 'a1' is used uninitialized in this function [-Werror=uninitialized]
vdev->args[1] = *a1;
~~~~~~~~~~~~~~^~~~~
drivers/scsi/fnic/vnic_dev.c:698:10: note: 'a1' was declared here
u64 a0, a1;
^~
Apparently the code relies on the local variables occupying
adjacent memory locations in the same order, but this is of
course not guaranteed.
Use an array of two u64 variables where needed to make it work
correctly.
I suspect there is also an endianess bug here, but have not
digged in deep enough to be sure.
Cc: stable(a)vger.kernel.org
Fixes: 5df6d737dd4b ("[SCSI] fnic: Add new Cisco PCI-Express FCoE HBA")
Fixes: mmtom ("init/Kconfig: enable -O3 for all arches")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
drivers/scsi/fnic/vnic_dev.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/scsi/fnic/vnic_dev.c b/drivers/scsi/fnic/vnic_dev.c
index 1f55b9e4e74a..1b88a3b53eee 100644
--- a/drivers/scsi/fnic/vnic_dev.c
+++ b/drivers/scsi/fnic/vnic_dev.c
@@ -688,26 +688,26 @@ int vnic_dev_soft_reset_done(struct vnic_dev *vdev, int *done)
int vnic_dev_hang_notify(struct vnic_dev *vdev)
{
- u64 a0, a1;
+ u64 a0 = 0, a1 = 0;
int wait = 1000;
return vnic_dev_cmd(vdev, CMD_HANG_NOTIFY, &a0, &a1, wait);
}
int vnic_dev_mac_addr(struct vnic_dev *vdev, u8 *mac_addr)
{
- u64 a0, a1;
+ u64 a[2] = {};
int wait = 1000;
int err, i;
for (i = 0; i < ETH_ALEN; i++)
mac_addr[i] = 0;
- err = vnic_dev_cmd(vdev, CMD_MAC_ADDR, &a0, &a1, wait);
+ err = vnic_dev_cmd(vdev, CMD_MAC_ADDR, &a[0], &a[1], wait);
if (err)
return err;
for (i = 0; i < ETH_ALEN; i++)
- mac_addr[i] = ((u8 *)&a0)[i];
+ mac_addr[i] = ((u8 *)&a)[i];
return 0;
}
@@ -732,30 +732,30 @@ void vnic_dev_packet_filter(struct vnic_dev *vdev, int directed, int multicast,
void vnic_dev_add_addr(struct vnic_dev *vdev, u8 *addr)
{
- u64 a0 = 0, a1 = 0;
+ u64 a[2] = {};
int wait = 1000;
int err;
int i;
for (i = 0; i < ETH_ALEN; i++)
- ((u8 *)&a0)[i] = addr[i];
+ ((u8 *)&a)[i] = addr[i];
- err = vnic_dev_cmd(vdev, CMD_ADDR_ADD, &a0, &a1, wait);
+ err = vnic_dev_cmd(vdev, CMD_ADDR_ADD, &a[0], &a[1], wait);
if (err)
pr_err("Can't add addr [%pM], %d\n", addr, err);
}
void vnic_dev_del_addr(struct vnic_dev *vdev, u8 *addr)
{
- u64 a0 = 0, a1 = 0;
+ u64 a[2] = {};
int wait = 1000;
int err;
int i;
for (i = 0; i < ETH_ALEN; i++)
- ((u8 *)&a0)[i] = addr[i];
+ ((u8 *)&a)[i] = addr[i];
- err = vnic_dev_cmd(vdev, CMD_ADDR_DEL, &a0, &a1, wait);
+ err = vnic_dev_cmd(vdev, CMD_ADDR_DEL, &a[0], &a[1], wait);
if (err)
pr_err("Can't del addr [%pM], %d\n", addr, err);
}
--
2.20.0
In year 2007 high performance SSD was still expensive, in order to
save more space for real workload or meta data, the readahead I/Os
for non-meta data was bypassed and not cached on SSD.
In now days, SSD price drops a lot and people can find larger size
SSD with more comfortable price. It is unncessary to bypass normal
readahead I/Os to save SSD space for now.
This patch removes the code which checks REQ_RAHEAD tag of bio in
check_should_bypass(), then all readahead I/Os will be cached on SSD.
NOTE: this patch still keeps the checking of "REQ_META|REQ_PRIO" in
should_writeback(), because we still want to cache meta data I/Os
even they are asynchronized.
Cc: stable(a)vger.kernel.org
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Eric Wheeler <bcache(a)linux.ewheeler.net>
---
drivers/md/bcache/request.c | 9 ---------
1 file changed, 9 deletions(-)
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 73478a91a342..acc07c4f27ae 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -378,15 +378,6 @@ static bool check_should_bypass(struct cached_dev *dc, struct bio *bio)
op_is_write(bio_op(bio))))
goto skip;
- /*
- * Flag for bypass if the IO is for read-ahead or background,
- * unless the read-ahead request is for metadata
- * (eg, for gfs2 or xfs).
- */
- if (bio->bi_opf & (REQ_RAHEAD|REQ_BACKGROUND) &&
- !(bio->bi_opf & (REQ_META|REQ_PRIO)))
- goto skip;
-
if (bio->bi_iter.bi_sector & (c->sb.block_size - 1) ||
bio_sectors(bio) & (c->sb.block_size - 1)) {
pr_debug("skipping unaligned io");
--
2.16.4