summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorMatt Fleming <matt@codeblueprint.co.uk>2016-08-24 14:12:08 +0100
committerBen Hutchings <ben@decadent.org.uk>2016-11-20 01:17:30 +0000
commit9371773db47affedd29a857e8707f16562cc94b4 (patch)
tree87c1b07cb1b7259fd4984c82eec4135983ead697
parent755b0ecbe124df0fa5a58032918d30231b337615 (diff)
downloadlwn-9371773db47affedd29a857e8707f16562cc94b4.tar.gz
lwn-9371773db47affedd29a857e8707f16562cc94b4.zip
perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2
commit 080fe0b790ad438fc1b61621dac37c1964ce7f35 upstream. While the Intel PMU monitors the LLC when perf enables the HW_CACHE_REFERENCES and HW_CACHE_MISSES events, these events monitor L1 instruction cache fetches (0x0080) and instruction cache misses (0x0081) on the AMD PMU. This is extremely confusing when monitoring the same workload across Intel and AMD machines, since parameters like, $ perf stat -e cache-references,cache-misses measure completely different things. Instead, make the AMD PMU measure instruction/data cache and TLB fill requests to the L2 and instruction/data cache and TLB misses in the L2 when HW_CACHE_REFERENCES and HW_CACHE_MISSES are enabled, respectively. That way the events measure unified caches on both platforms. Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1472044328-21302-1-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org> [bwh: Backported to 3.16: - Drop KVM PMU changes - Adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-rw-r--r--arch/x86/kernel/cpu/perf_event_amd.c4
1 files changed, 2 insertions, 2 deletions
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c
index beeb7cc07044..dc4c3b89c3f5 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -119,8 +119,8 @@ static const u64 amd_perfmon_event_map[] =
{
[PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
[PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
- [PERF_COUNT_HW_CACHE_REFERENCES] = 0x0080,
- [PERF_COUNT_HW_CACHE_MISSES] = 0x0081,
+ [PERF_COUNT_HW_CACHE_REFERENCES] = 0x077d,
+ [PERF_COUNT_HW_CACHE_MISSES] = 0x077e,
[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
[PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x00d0, /* "Decoder empty" event */