perf/core: Fix scheduling regression of pinned groups

Vince Weaver reported:

> I was tracking down some regressions in my perf_event_test testsuite.
> Some of the tests broke in the 4.11-rc1 timeframe.
>
> I've bisected one of them, this report is about
>	tests/overflow/simul_oneshot_group_overflow
> This test creates an event group containing two sampling events, set
> to overflow to a signal handler (which disables and then refreshes the
> event).
>
> On a good kernel you get the following:
>	Event perf::instructions with period 1000000
>	Event perf::instructions with period 2000000
>	fd 3 overflows: 946 (perf::instructions/1000000)
>	fd 4 overflows: 473 (perf::instructions/2000000)
>	Ending counts:
>		Count 0: 946379875
>		Count 1: 946365218
>
> With the broken kernels you get:
>	Event perf::instructions with period 1000000
>	Event perf::instructions with period 2000000
>	fd 3 overflows: 938 (perf::instructions/1000000)
>	fd 4 overflows: 318 (perf::instructions/2000000)
>	Ending counts:
>		Count 0: 946373080
>		Count 1: 653373058

The root cause of the bug is that the following commit:

  487f05e18a ("perf/core: Optimize event rescheduling on active contexts")

erroneously assumed that an event's 'pinned' setting determines whether the
event belongs to a pinned group or not, but in fact, it's the group
leader's pinned state that matters.

This was discovered by Vince in the test case described above, where two
instruction counters are grouped, the group leader is pinned, but the
other event is not; in the regressed case the counters were off by 33%
(the difference between events' periods), but should be the same within
the error margin.

Fix the problem by looking at the group leader's pinning.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 487f05e18a ("perf/core: Optimize event rescheduling on active contexts")
Link: http://lkml.kernel.org/r/87lgnmvw7h.fsf@ashishki-desk.ger.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
author: Alexander Shishkin <alexander.shishkin@linux.intel.com> 2017-07-18 14:08:34 +0300
committer: Ingo Molnar <mingo@kernel.org> 2017-07-20 09:43:02 +0200
commit: 3bda69c1c3993a2bddbae01397d12bfef6054011 (patch)
tree: f68537445d68484381849c1f105e4bb69a51f9ea /kernel
parent: dc853e26f73e903e0c87e24f2695b5dcf33b3bc1 (diff)
1 files changed, 7 insertions, 0 deletions
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9747e422ab20..c9cdbd396770 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1452,6 +1452,13 @@ static enum event_type_t get_event_type(struct perf_event *event)
 
 	lockdep_assert_held(&ctx->lock);
 
+	/*
+	 * It's 'group type', really, because if our group leader is
+	 * pinned, so are we.
+	 */
+	if (event->group_leader != event)
+		event = event->group_leader;
+
 	event_type = event->attr.pinned ? EVENT_PINNED : EVENT_FLEXIBLE;
 	if (!ctx->task)
 		event_type |= EVENT_CPU;
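
The effect of the hunk above can be illustrated outside the kernel with a small user-space sketch. Everything here is a simplified stand-in and an assumption made for illustration: `struct sketch_event`, the `SKETCH_*` flags (with arbitrary values) and both functions are hypothetical names, not the kernel's `struct perf_event`, `enum event_type_t` or `get_event_type()`. The `has_task` parameter models the `ctx->task` check. The sketch only contrasts classifying by the event's own pinned flag with classifying by its group leader's flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags for illustration; the values are arbitrary. */
enum sketch_event_type {
	SKETCH_FLEXIBLE = 0x1,
	SKETCH_PINNED   = 0x2,
	SKETCH_CPU      = 0x4,
};

/* Hypothetical, heavily reduced model of an event. */
struct sketch_event {
	bool pinned;                       /* models event->attr.pinned */
	struct sketch_event *group_leader; /* a leader points to itself */
};

/* Models the regressed logic: only the event's own flag is consulted. */
static int sketch_type_before_fix(const struct sketch_event *event,
				  bool has_task)
{
	int type = event->pinned ? SKETCH_PINNED : SKETCH_FLEXIBLE;

	if (!has_task)
		type |= SKETCH_CPU;
	return type;
}

/* Models the fixed logic: a sibling inherits its leader's pinning. */
static int sketch_type_after_fix(const struct sketch_event *event,
				 bool has_task)
{
	int type;

	assert(event != NULL && event->group_leader != NULL);

	/* Switch to the group leader before reading the pinned flag. */
	if (event->group_leader != event)
		event = event->group_leader;

	type = event->pinned ? SKETCH_PINNED : SKETCH_FLEXIBLE;
	if (!has_task)
		type |= SKETCH_CPU;
	return type;
}
```

For a group shaped like the one in the report (pinned leader, non-pinned sibling), the first function classifies the sibling as flexible while the second classifies it as pinned, matching its leader.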