<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/arch/s390/kernel/smp.c, branch docs-5.3-1</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.3-1</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.3-1'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2019-06-15T10:25:52+00:00</updated>
<entry>
<title>s390: improve wait logic of stop_machine</title>
<updated>2019-06-15T10:25:52+00:00</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2019-05-17T10:50:42+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=38f2c691a4b3e89d476f8e8350d1ca299974b89d'/>
<id>urn:sha1:38f2c691a4b3e89d476f8e8350d1ca299974b89d</id>
<content type='text'>
The stop_machine loop to advance the state machine and to wait for all
affected CPUs to check-in calls cpu_relax_yield in a tight loop until
the last missing CPUs acknowledged the state transition.

On a virtual system where not all logical CPUs are backed by real CPUs
all the time it can take a while for all CPUs to check-in. With the
current definition of cpu_relax_yield a diagnose 0x44 is done which
tells the hypervisor to schedule *some* other CPU. That can be any
CPU and not necessarily one of the CPUs that need to run in order to
advance the state machine. This can lead to a pretty bad diagnose 0x44
storm until the last missing CPU finally checked-in.

Replace the undirected cpu_relax_yield based on diagnose 0x44 with a
directed yield. Each CPU in the wait loop will pick up the next CPU
in the cpumask of stop_machine. The diagnose 0x9c is used to tell the
hypervisor to run this next CPU instead of the current one. If there
is only a limited number of real CPUs backing the virtual CPUs we
end up with the real CPUs passed around in a round-robin fashion.

[heiko.carstens@de.ibm.com]:
    Use cpumask_next_wrap as suggested by Peter Zijlstra.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>s390: enforce CONFIG_HOTPLUG_CPU</title>
<updated>2019-06-07T08:09:42+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2019-06-03T13:22:25+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=3e8eb22faee179798530b6a3d2639629fcf9d580'/>
<id>urn:sha1:3e8eb22faee179798530b6a3d2639629fcf9d580</id>
<content type='text'>
x86 and powerpc (partially) enforce already CONFIG_HOTPLUG_CPU. On
s390 it is enabled on all distributions by default since ages.
The only exception is our zfcpdump kernel.

However to simplify testing, enforce HOTPLUG_CPU. This was suggested
by Paul McKenney, since his rcutorture test environments for CONFIG_SMP=y
only support HOTPLUG_CPU=y.

Suggested-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
Acked-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>s390/unwind: introduce stack unwind API</title>
<updated>2019-05-02T11:54:11+00:00</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2019-01-28T07:33:08+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=78c98f9074135d3dab4e39544e0a537f92388fce'/>
<id>urn:sha1:78c98f9074135d3dab4e39544e0a537f92388fce</id>
<content type='text'>
Rework the dump_trace() stack unwinder interface to support different
unwinding algorithms. The new interface looks like this:

	struct unwind_state state;
	unwind_for_each_frame(&amp;state, task, regs, start_stack)
		do_something(state.sp, state.ip, state.reliable);

The unwind_bc.c file contains the implementation for the classic
back-chain unwinder.

One positive side effect of the new code is it now handles ftraced
functions gracefully. It prints the real name of the return function
instead of 'return_to_handler'.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>s390/kernel: introduce .dma sections</title>
<updated>2019-04-29T08:47:10+00:00</updated>
<author>
<name>Gerald Schaefer</name>
<email>gerald.schaefer@de.ibm.com</email>
</author>
<published>2019-02-03T20:37:20+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a80313ff91abda67641dc33bed97f6bcc5e9f6a4'/>
<id>urn:sha1:a80313ff91abda67641dc33bed97f6bcc5e9f6a4</id>
<content type='text'>
With a relocatable kernel that could reside at any place in memory, code
and data that has to stay below 2 GB needs special handling.

This patch introduces .dma sections for such text, data and ex_table.
The sections will be part of the decompressor kernel, so they will not
be relocated and stay below 2 GB. Their location is passed over to the
decompressed / relocated kernel via the .boot.preserved.data section.

The duald and aste for control register setup also need to stay below
2 GB, so move the setup code from arch/s390/kernel/head64.S to
arch/s390/boot/head.S. The duct and linkage_stack could reside above
2 GB, but their content has to be preserved for the decompresed kernel,
so they are also moved into the .dma section.

The start and end address of the .dma sections is added to vmcoreinfo,
for crash support, to help debugging in case the kernel crashed there.

Signed-off-by: Gerald Schaefer &lt;gerald.schaefer@de.ibm.com&gt;
Reviewed-by: Philipp Rudo &lt;prudo@linux.ibm.com&gt;
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 's390-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux</title>
<updated>2019-03-28T15:35:32+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-03-28T15:35:32+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=bfed6d0ffc8dba002312c2641c00ecd3bf9f9cbf'/>
<id>urn:sha1:bfed6d0ffc8dba002312c2641c00ecd3bf9f9cbf</id>
<content type='text'>
Pull s390 fixes from Martin Schwidefsky:
 "Improvements and bug fixes for 5.1-rc2:

   - Fix early free of the channel program in vfio

   - On AP device removal make sure that all messages are flushed with
     the driver still attached that queued the message

   - Limit brk randomization to 32MB to reduce the chance that the heap
     of ld.so is placed after the main stack

   - Add a rolling average for the steal time of a CPU, this will be
     needed for KVM to decide when to do busy waiting

   - Fix a warning in the CPU-MF code

   - Add a notification handler for AP configuration change to react
     faster to new AP devices"

* tag 's390-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/cpumf: Fix warning from check_processor_id
  zcrypt: handle AP Info notification from CHSC SEI command
  vfio: ccw: only free cp on final interrupt
  s390/vtime: steal time exponential moving average
  s390/zcrypt: revisit ap device remove procedure
  s390: limit brk randomization to 32MB
</content>
</entry>
<entry>
<title>treewide: add checks for the return value of memblock_alloc*()</title>
<updated>2019-03-12T17:04:02+00:00</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.ibm.com</email>
</author>
<published>2019-03-12T06:30:31+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=8a7f97b902f4fb0d94b355b6b3f1fbd7154cafb9'/>
<id>urn:sha1:8a7f97b902f4fb0d94b355b6b3f1fbd7154cafb9</id>
<content type='text'>
Add check for the return value of memblock_alloc*() functions and call
panic() in case of error.  The panic message repeats the one used by
panicing memblock allocators with adjustment of parameters to include
only relevant ones.

The replacement was mostly automated with semantic patches like the one
below with manual massaging of format strings.

  @@
  expression ptr, size, align;
  @@
  ptr = memblock_alloc(size, align);
  + if (!ptr)
  + 	panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);

[anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
  Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
[rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
  Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
[rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
  Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
[akpm@linux-foundation.org: fix xtensa printk warning]
Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Anders Roxell &lt;anders.roxell@linaro.org&gt;
Reviewed-by: Guo Ren &lt;ren_guo@c-sky.com&gt;		[c-sky]
Acked-by: Paul Burton &lt;paul.burton@mips.com&gt;		[MIPS]
Acked-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;	[s390]
Reviewed-by: Juergen Gross &lt;jgross@suse.com&gt;		[Xen]
Reviewed-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;	[m68k]
Acked-by: Max Filippov &lt;jcmvbkbc@gmail.com&gt;		[xtensa]
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Greentime Hu &lt;green.hu@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Guan Xuetao &lt;gxt@pku.edu.cn&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Mark Salter &lt;msalter@redhat.com&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Rob Herring &lt;robh+dt@kernel.org&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Stafford Horne &lt;shorne@gmail.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Yoshinori Sato &lt;ysato@users.sourceforge.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memblock: drop memblock_alloc_base()</title>
<updated>2019-03-12T17:04:01+00:00</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.ibm.com</email>
</author>
<published>2019-03-12T06:29:35+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=0ba9e6edd4c2e563a9b34c8a46649218814a363f'/>
<id>urn:sha1:0ba9e6edd4c2e563a9b34c8a46649218814a363f</id>
<content type='text'>
The memblock_alloc_base() function tries to allocate a memory up to the
limit specified by its max_addr parameter and panics if the allocation
fails.  Replace its usage with memblock_phys_alloc_range() and make the
callers check the return value and panic in case of error.

Link: http://lkml.kernel.org/r/1548057848-15136-10-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;		[powerpc]
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Greentime Hu &lt;green.hu@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Guan Xuetao &lt;gxt@pku.edu.cn&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Guo Ren &lt;ren_guo@c-sky.com&gt;				[c-sky]
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;			[Xen]
Cc: Mark Salter &lt;msalter@redhat.com&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Rob Herring &lt;robh+dt@kernel.org&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Stafford Horne &lt;shorne@gmail.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Yoshinori Sato &lt;ysato@users.sourceforge.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>s390/vtime: steal time exponential moving average</title>
<updated>2019-03-06T13:59:50+00:00</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2019-03-06T11:31:21+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=152e9b8676c6e788c6bff095c1eaae7b86df5003'/>
<id>urn:sha1:152e9b8676c6e788c6bff095c1eaae7b86df5003</id>
<content type='text'>
To be able to judge the current overcommitment ratio for a CPU add
a lowcore field with the exponential moving average of the steal time.
The average is updated every tick.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>s390/smp: Fix calling smp_call_ipl_cpu() from ipl CPU</title>
<updated>2019-01-11T16:12:03+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2019-01-11T14:18:22+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=60f1bf29c0b2519989927cae640cd1f50f59dc7f'/>
<id>urn:sha1:60f1bf29c0b2519989927cae640cd1f50f59dc7f</id>
<content type='text'>
When calling smp_call_ipl_cpu() from the IPL CPU, we will try to read
from pcpu_devices-&gt;lowcore. However, due to prefixing, that will result
in reading from absolute address 0 on that CPU. We have to go via the
actual lowcore instead.

This means that right now, we will read lc-&gt;nodat_stack == 0 and
therfore work on a very wrong stack.

This BUG essentially broke rebooting under QEMU TCG (which will report
a low address protection exception). And checking under KVM, it is
also broken under KVM. With 1 VCPU it can be easily triggered.

:/# echo 1 &gt; /proc/sys/kernel/sysrq
:/# echo b &gt; /proc/sysrq-trigger
[   28.476745] sysrq: SysRq : Resetting
[   28.476793] Kernel stack overflow.
[   28.476817] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
[   28.476820] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
[   28.476826] Krnl PSW : 0400c00180000000 0000000000115c0c (pcpu_delegate+0x12c/0x140)
[   28.476861]            R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[   28.476863] Krnl GPRS: ffffffffffffffff 0000000000000000 000000000010dff8 0000000000000000
[   28.476864]            0000000000000000 0000000000000000 0000000000ab7090 000003e0006efbf0
[   28.476864]            000000000010dff8 0000000000000000 0000000000000000 0000000000000000
[   28.476865]            000000007fffc000 0000000000730408 000003e0006efc58 0000000000000000
[   28.476887] Krnl Code: 0000000000115bfe: 4170f000            la      %r7,0(%r15)
[   28.476887]            0000000000115c02: 41f0a000            la      %r15,0(%r10)
[   28.476887]           #0000000000115c06: e370f0980024        stg     %r7,152(%r15)
[   28.476887]           &gt;0000000000115c0c: c0e5fffff86e        brasl   %r14,114ce8
[   28.476887]            0000000000115c12: 41f07000            la      %r15,0(%r7)
[   28.476887]            0000000000115c16: a7f4ffa8            brc     15,115b66
[   28.476887]            0000000000115c1a: 0707                bcr     0,%r7
[   28.476887]            0000000000115c1c: 0707                bcr     0,%r7
[   28.476901] Call Trace:
[   28.476902] Last Breaking-Event-Address:
[   28.476920]  [&lt;0000000000a01c4a&gt;] arch_call_rest_init+0x22/0x80
[   28.476927] Kernel panic - not syncing: Corrupt kernel stack, can't continue.
[   28.476930] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
[   28.476932] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
[   28.476932] Call Trace:

Fixes: 2f859d0dad81 ("s390/smp: reduce size of struct pcpu")
Cc: stable@vger.kernel.org # 4.0+
Reported-by: Cornelia Huck &lt;cohuck@redhat.com&gt;
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>s390/smp: fix CPU hotplug deadlock with CPU rescan</title>
<updated>2019-01-11T16:12:02+00:00</updated>
<author>
<name>Gerald Schaefer</name>
<email>gerald.schaefer@de.ibm.com</email>
</author>
<published>2019-01-09T12:00:03+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=b7cb707c373094ce4008d4a6ac9b6b366ec52da5'/>
<id>urn:sha1:b7cb707c373094ce4008d4a6ac9b6b366ec52da5</id>
<content type='text'>
smp_rescan_cpus() is called without the device_hotplug_lock, which can lead
to a dedlock when a new CPU is found and immediately set online by a udev
rule.

This was observed on an older kernel version, where the cpu_hotplug_begin()
loop was still present, and it resulted in hanging chcpu and systemd-udev
processes. This specific deadlock will not show on current kernels. However,
there may be other possible deadlocks, and since smp_rescan_cpus() can still
trigger a CPU hotplug operation, the device_hotplug_lock should be held.

For reference, this was the deadlock with the old cpu_hotplug_begin() loop:

        chcpu (rescan)                       systemd-udevd

 echo 1 &gt; /sys/../rescan
 -&gt; smp_rescan_cpus()
 -&gt; (*) get_online_cpus()
    (increases refcount)
 -&gt; smp_add_present_cpu()
    (new CPU found)
 -&gt; register_cpu()
 -&gt; device_add()
 -&gt; udev "add" event triggered -----------&gt; udev rule sets CPU online
                                         -&gt; echo 1 &gt; /sys/.../online
                                         -&gt; lock_device_hotplug_sysfs()
                                            (this is missing in rescan path)
                                         -&gt; device_online()
                                         -&gt; (**) device_lock(new CPU dev)
                                         -&gt; cpu_up()
                                         -&gt; cpu_hotplug_begin()
                                            (loops until refcount == 0)
                                            -&gt; deadlock with (*)
 -&gt; bus_probe_device()
 -&gt; device_attach()
 -&gt; device_lock(new CPU dev)
    -&gt; deadlock with (**)

Fix this by taking the device_hotplug_lock in the CPU rescan path.

Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Gerald Schaefer &lt;gerald.schaefer@de.ibm.com&gt;
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
</feed>
