<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/kernel, branch docs-6.5-2</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-6.5-2</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-6.5-2'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2023-05-14T15:00:46+00:00</updated>
<entry>
<title>Merge tag 'locking_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2023-05-14T15:00:46+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-05-14T15:00:46+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=31f4104e392a10327f90006a61fcf1d4e7a79472'/>
<id>urn:sha1:31f4104e392a10327f90006a61fcf1d4e7a79472</id>
<content type='text'>
Pull locking fix from Borislav Petkov:

 - Make sure __down_read_common() is always inlined so that the callers'
   names land in traceevents output and thus the blocked function can be
   identified

* tag 'locking_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem: Add __always_inline annotation to __down_read_common() and inlined callers
</content>
</entry>
<entry>
<title>Merge tag 'perf_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2023-05-14T14:56:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-05-14T14:56:51+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=ef21831c2e4e66cb948d5107f47c1aa0a5711a56'/>
<id>urn:sha1:ef21831c2e4e66cb948d5107f47c1aa0a5711a56</id>
<content type='text'>
Pull perf fixes from Borislav Petkov:

 - Make sure the PEBS buffer is flushed before reprogramming the
   hardware so that the correct record sizes are used

 - Update the sample size for AMD BRS events

 - Fix a confusion with using the same on-stack struct with different
   events in the event processing path

* tag 'perf_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG
  perf/x86: Fix missing sample size update on AMD BRS
  perf/core: Fix perf_sample_data not properly initialized for different swevents in perf_tp_event()
</content>
</entry>
<entry>
<title>Merge tag 'sched_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2023-05-14T14:50:34+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-05-14T14:50:34+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=f3b9e8e4c84845d09dbba49f664fc8407f18a3dc'/>
<id>urn:sha1:f3b9e8e4c84845d09dbba49f664fc8407f18a3dc</id>
<content type='text'>
Pull scheduler fix from Borislav Petkov:

 - Fix a couple of kernel-doc warnings

* tag 'sched_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: fix cid_lock kernel-doc warnings
</content>
</entry>
<entry>
<title>tick/broadcast: Make broadcast device replacement work correctly</title>
<updated>2023-05-08T21:18:16+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2023-05-06T16:40:57+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=f9d36cf445ffff0b913ba187a3eff78028f9b1fb'/>
<id>urn:sha1:f9d36cf445ffff0b913ba187a3eff78028f9b1fb</id>
<content type='text'>
When a tick broadcast clockevent device is initialized for one shot mode
then tick_broadcast_setup_oneshot() OR's the periodic broadcast mode
cpumask into the oneshot broadcast cpumask.

This is required when switching from periodic broadcast mode to oneshot
broadcast mode to ensure that CPUs which are waiting for periodic
broadcast are woken up on the next tick.

But it is subtly broken, when an active broadcast device is replaced and
the system is already in oneshot (NOHZ/HIGHRES) mode. Victor observed
this and debugged the issue.

Then the OR of the periodic broadcast CPU mask is wrong as the periodic
cpumask bits are sticky after tick_broadcast_enable() set it for a CPU
unless explicitly cleared via tick_broadcast_disable().

That means that this sets all other CPUs which have tick broadcasting
enabled at that point unconditionally in the oneshot broadcast mask.

If the affected CPUs were already idle and had their bits set in the
oneshot broadcast mask then this does no harm. But for non idle CPUs
which were not set this corrupts their state.

On their next invocation of tick_broadcast_enable() they observe the bit
set, which indicates that the broadcast for the CPU is already set up.
As a consequence they fail to update the broadcast event even if their
earliest expiring timer is before the actually programmed broadcast
event.

If the programmed broadcast event is far in the future, then this can
cause stalls or trigger the hung task detector.

Avoid this by telling tick_broadcast_setup_oneshot() explicitly whether
this is the initial switch over from periodic to oneshot broadcast which
must take the periodic broadcast mask into account. In the case of
initialization of a replacement device this prevents that the broadcast
oneshot mask is modified.

There is a second problem with broadcast device replacement in this
function. The broadcast device is only armed when the previous state of
the device was periodic.

That is correct for the switch from periodic broadcast mode to oneshot
broadcast mode as the underlying broadcast device could operate in
oneshot state already due to lack of periodic state in hardware. In that
case it is already armed to expire at the next tick.

For the replacement case this is wrong as the device is in shutdown
state. That means that any already pending broadcast event will not be
armed.

This went unnoticed because any CPU which goes idle will observe that
the broadcast device has an expiry time of KTIME_MAX and therefore any
CPUs next timer event will be earlier and cause a reprogramming of the
broadcast device. But that does not guarantee that the events of the
CPUs which were already in idle are delivered on time.

Fix this by arming the newly installed device for an immediate event
which will reevaluate the per CPU expiry times and reprogram the
broadcast device accordingly. This is simpler than caching the last
expiry time in yet another place or saving it before the device exchange
and handing it down to the setup function. Replacement of broadcast
devices is not a frequent operation and usually happens once somewhere
late in the boot process.

Fixes: 9c336c9935cf ("tick/broadcast: Allow late registered device to enter oneshot mode")
Reported-by: Victor Hassan &lt;victor@allwinnertech.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Link: https://lore.kernel.org/r/87pm7d2z1i.ffs@tglx

</content>
</entry>
<entry>
<title>sched: fix cid_lock kernel-doc warnings</title>
<updated>2023-05-08T08:58:28+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>rdunlap@infradead.org</email>
</author>
<published>2023-04-28T03:11:11+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=0019a2d4b7e37a983d133d42b707b8a3018ae6f4'/>
<id>urn:sha1:0019a2d4b7e37a983d133d42b707b8a3018ae6f4</id>
<content type='text'>
Fix kernel-doc warnings for cid_lock and use_cid_lock.
These comments are not in kernel-doc format.

kernel/sched/core.c:11496: warning: Cannot understand  * @cid_lock: Guarantee forward-progress of cid allocation.
 on line 11496 - I thought it was a doc line
kernel/sched/core.c:11505: warning: Cannot understand  * @use_cid_lock: Select cid allocation behavior: lock-free vs spinlock.
 on line 11505 - I thought it was a doc line

Fixes: 223baf9d17f2 ("sched: Fix performance regression introduced by mm_cid")
Signed-off-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20230428031111.322-1-rdunlap@infradead.org
</content>
</entry>
<entry>
<title>perf/core: Fix perf_sample_data not properly initialized for different swevents in perf_tp_event()</title>
<updated>2023-05-08T08:58:26+00:00</updated>
<author>
<name>Yang Jihong</name>
<email>yangjihong1@huawei.com</email>
</author>
<published>2023-04-25T10:32:17+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=1d1bfe30dad50d4bea83cd38d73c441972ea0173'/>
<id>urn:sha1:1d1bfe30dad50d4bea83cd38d73c441972ea0173</id>
<content type='text'>
data-&gt;sample_flags may be modified in perf_prepare_sample(),
in perf_tp_event(), different swevents use the same on-stack
perf_sample_data, the previous swevent may change sample_flags in
perf_prepare_sample(), as a result, some members of perf_sample_data are
not correctly initialized when next swevent_event preparing sample
(for example data-&gt;id, the value varies according to swevent).

A simple scenario triggers this problem is as follows:

  # perf record -e sched:sched_switch --switch-output-event sched:sched_switch -a sleep 1
  [ perf record: dump data: Woken up 0 times ]
  [ perf record: Dump perf.data.2023041209014396 ]
  [ perf record: dump data: Woken up 0 times ]
  [ perf record: Dump perf.data.2023041209014662 ]
  [ perf record: dump data: Woken up 0 times ]
  [ perf record: Dump perf.data.2023041209014910 ]
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Dump perf.data.2023041209015164 ]
  [ perf record: Captured and wrote 0.069 MB perf.data.&lt;timestamp&gt; ]
  # ls -l
  total 860
  -rw------- 1 root root  95694 Apr 12 09:01 perf.data.2023041209014396
  -rw------- 1 root root 606430 Apr 12 09:01 perf.data.2023041209014662
  -rw------- 1 root root  82246 Apr 12 09:01 perf.data.2023041209014910
  -rw------- 1 root root  82342 Apr 12 09:01 perf.data.2023041209015164
  # perf script -i perf.data.2023041209014396
  0x11d58 [0x80]: failed to process type: 9 [Bad address]

Solution: Re-initialize perf_sample_data after each event is processed.
Note that data-&gt;raw-&gt;frag.data may be accessed in perf_tp_event_match().
Therefore, need to init sample_data and then go through swevent hlist to prevent
reference of NULL pointer, reported by [1].

After fix:

  # perf record -e sched:sched_switch --switch-output-event sched:sched_switch -a sleep 1
  [ perf record: dump data: Woken up 0 times ]
  [ perf record: Dump perf.data.2023041209442259 ]
  [ perf record: dump data: Woken up 0 times ]
  [ perf record: Dump perf.data.2023041209442514 ]
  [ perf record: dump data: Woken up 0 times ]
  [ perf record: Dump perf.data.2023041209442760 ]
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Dump perf.data.2023041209443003 ]
  [ perf record: Captured and wrote 0.069 MB perf.data.&lt;timestamp&gt; ]
  # ls -l
  total 864
  -rw------- 1 root root 100166 Apr 12 09:44 perf.data.2023041209442259
  -rw------- 1 root root 606438 Apr 12 09:44 perf.data.2023041209442514
  -rw------- 1 root root  82246 Apr 12 09:44 perf.data.2023041209442760
  -rw------- 1 root root  82342 Apr 12 09:44 perf.data.2023041209443003
  # perf script -i perf.data.2023041209442259 | head -n 5
              perf   232 [000]    66.846217: sched:sched_switch: prev_comm=perf prev_pid=232 prev_prio=120 prev_state=D ==&gt; next_comm=perf next_pid=234 next_prio=120
              perf   234 [000]    66.846449: sched:sched_switch: prev_comm=perf prev_pid=234 prev_prio=120 prev_state=S ==&gt; next_comm=perf next_pid=232 next_prio=120
              perf   232 [000]    66.846546: sched:sched_switch: prev_comm=perf prev_pid=232 prev_prio=120 prev_state=R ==&gt; next_comm=perf next_pid=234 next_prio=120
              perf   234 [000]    66.846606: sched:sched_switch: prev_comm=perf prev_pid=234 prev_prio=120 prev_state=S ==&gt; next_comm=perf next_pid=232 next_prio=120
              perf   232 [000]    66.846646: sched:sched_switch: prev_comm=perf prev_pid=232 prev_prio=120 prev_state=R ==&gt; next_comm=perf next_pid=234 next_prio=120

[1] Link: https://lore.kernel.org/oe-lkp/202304250929.efef2caa-yujie.liu@intel.com

Fixes: bb447c27a467 ("perf/core: Set data-&gt;sample_flags in perf_prepare_sample()")
Signed-off-by: Yang Jihong &lt;yangjihong1@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20230425103217.130600-1-yangjihong1@huawei.com
</content>
</entry>
<entry>
<title>locking/rwsem: Add __always_inline annotation to __down_read_common() and inlined callers</title>
<updated>2023-05-08T08:58:24+00:00</updated>
<author>
<name>John Stultz</name>
<email>jstultz@google.com</email>
</author>
<published>2023-05-03T02:33:51+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=92cc5d00a431e96e5a49c0b97e5ad4fa7536bd4b'/>
<id>urn:sha1:92cc5d00a431e96e5a49c0b97e5ad4fa7536bd4b</id>
<content type='text'>
Apparently despite it being marked inline, the compiler
may not inline __down_read_common() which makes it difficult
to identify the cause of lock contention, as the blocked
function in traceevents will always be listed as
__down_read_common().

So this patch adds __always_inline annotation to the common
function (as well as the inlined helper callers) to force it to
be inlined so the blocking function will be listed (via Wchan)
in traceevents.

Fixes: c995e638ccbb ("locking/rwsem: Fold __down_{read,write}*()")
Reported-by: Tim Murray &lt;timmurray@google.com&gt;
Signed-off-by: John Stultz &lt;jstultz@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Waiman Long &lt;longman@redhat.com&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20230503023351.2832796-1-jstultz@google.com
</content>
</entry>
<entry>
<title>Merge tag 'trace-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace</title>
<updated>2023-05-05T20:11:02+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-05-05T20:11:02+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=e919a3f7057b5ca918dea98826b39a38eff9bebb'/>
<id>urn:sha1:e919a3f7057b5ca918dea98826b39a38eff9bebb</id>
<content type='text'>
Pull more tracing updates from Steven Rostedt:

 - Make buffer_percent read/write.

   The buffer_percent file is how users can state how long to block on
   the tracing buffer depending on how much is in the buffer. When it
   hits the "buffer_percent" it will wake the task waiting on the
   buffer. For some reason it was set to read-only.

   This was not noticed because testing was done as root without
   SELinux, but with SELinux it will prevent even root to write to it
   without having CAP_DAC_OVERRIDE.

 - The "touched_functions" was added this merge window, but one of the
   reasons for adding it was not implemented.

   That was to show what functions were not only touched, but had either
   a direct trampoline attached to it, or a kprobe or live kernel
   patching that can "hijack" the function to run a different function.
   The point is to know if there's functions in the kernel that may not
   be behaving as the kernel code shows. This can be used for debugging.

   TODO: Add this information to kernel oops too.

* tag 'trace-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ftrace: Add MODIFIED flag to show if IPMODIFY or direct was attached
  tracing: Fix permissions for the buffer_percent file
</content>
</entry>
<entry>
<title>Merge tag 'locking-core-2023-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2023-05-05T19:56:55+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-05-05T19:56:55+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=b115d85a9584c98f9a7dec209d835462aa1adc09'/>
<id>urn:sha1:b115d85a9584c98f9a7dec209d835462aa1adc09</id>
<content type='text'>
Pull locking updates from Ingo Molnar:

 - Introduce local{,64}_try_cmpxchg() - a slightly more optimal
   primitive, which will be used in perf events ring-buffer code

 - Simplify/modify rwsems on PREEMPT_RT, to address writer starvation

 - Misc cleanups/fixes

* tag 'locking-core-2023-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomic: Correct (cmp)xchg() instrumentation
  locking/x86: Define arch_try_cmpxchg_local()
  locking/arch: Wire up local_try_cmpxchg()
  locking/generic: Wire up local{,64}_try_cmpxchg()
  locking/atomic: Add generic try_cmpxchg{,64}_local() support
  locking/rwbase: Mitigate indefinite writer starvation
  locking/arch: Rename all internal __xchg() names to __arch_xchg()
</content>
</entry>
<entry>
<title>ftrace: Add MODIFIED flag to show if IPMODIFY or direct was attached</title>
<updated>2023-05-05T15:09:25+00:00</updated>
<author>
<name>Steven Rostedt (Google)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2023-05-03T01:32:33+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=6ce2c04fcbcaa5eb086e5142ab359be306cbc3e1'/>
<id>urn:sha1:6ce2c04fcbcaa5eb086e5142ab359be306cbc3e1</id>
<content type='text'>
If a function had ever had IPMODIFY or DIRECT attached to it, where this
is how live kernel patching and BPF overrides work, mark them and display
an "M" in the enabled_functions and touched_functions files. This can be
used for debugging. If a function had been modified and later there's a bug
in the code related to that function, this can be used to know if the cause
is possibly from a live kernel patch or a BPF program that changed the
behavior of the code.

Also update the documentation on the enabled_functions and
touched_functions output, as it was missing direct callers and CALL_OPS.
And include this new modify attribute.

Link: https://lore.kernel.org/linux-trace-kernel/20230502213233.004e3ae4@gandalf.local.home

Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Acked-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
</feed>
