summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-08-27powerpc/64: optimise LOAD_REG_IMMEDIATE_SYM()Christophe Leroy
Optimise LOAD_REG_IMMEDIATE_SYM() using a temporary register to parallelise operations. It reduces the path from 5 to 3 instructions. Suggested-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bad41ed02531bb0382420cbab50a0d7153b71767.1566311636.git.christophe.leroy@c-s.fr
2019-08-27powerpc/32: replace LOAD_MSR_KERNEL() by LOAD_REG_IMMEDIATE()Christophe Leroy
LOAD_MSR_KERNEL() and LOAD_REG_IMMEDIATE() are doing the same thing in the same way. Drop LOAD_MSR_KERNEL() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8f04a6df0bc8949517fd8236d50c15008ccf9231.1566311636.git.christophe.leroy@c-s.fr
2019-08-27powerpc: rewrite LOAD_REG_IMMEDIATE() as an intelligent macroChristophe Leroy
Today LOAD_REG_IMMEDIATE() is a basic #define which loads all parts on a value into a register, including the parts that are NUL. This means always 2 instructions on PPC32 and always 5 instructions on PPC64. And those instructions cannot run in parallele as they are updating the same register. Ex: LOAD_REG_IMMEDIATE(r1,THREAD_SIZE) in head_64.S results in: 3c 20 00 00 lis r1,0 60 21 00 00 ori r1,r1,0 78 21 07 c6 rldicr r1,r1,32,31 64 21 00 00 oris r1,r1,0 60 21 40 00 ori r1,r1,16384 Rewrite LOAD_REG_IMMEDIATE() with GAS macro in order to skip the parts that are NUL. Rename existing LOAD_REG_IMMEDIATE() as LOAD_REG_IMMEDIATE_SYM() and use that one for loading value of symbols which are not known at compile time. Now LOAD_REG_IMMEDIATE(r1,THREAD_SIZE) in head_64.S results in: 38 20 40 00 li r1,16384 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d60ce8dd3a383c7adbfc322bf1d53d81724a6000.1566311636.git.christophe.leroy@c-s.fr
2019-08-27powerpc/mm: split out early ioremap path.Christophe Leroy
ioremap does things differently depending on whether SLAB is available or not at different levels. Try to separate the early path from the beginning. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3acd2dbe04b04f111475e7a59f2b6f2ab9b95ab6.1566309263.git.christophe.leroy@c-s.fr
2019-08-27powerpc/mm: refactor ioremap vm area setup.Christophe Leroy
PPC32 and PPC64 are doing the same once SLAB is available. Create a do_ioremap() function that calls get_vm_area and do the mapping. For PPC64, we add the 4K PFN hack sanity check to __ioremap_caller() in order to avoid using __ioremap_at(). Other checks in __ioremap_at() are irrelevant for __ioremap_caller(). On PPC64, VM area is allocated in the range [ioremap_bot ; IOREMAP_END] On PPC32, VM area is allocated in the range [VMALLOC_START ; VMALLOC_END] Lets define IOREMAP_START is ioremap_bot for PPC64, and alias IOREMAP_START/END to VMALLOC_START/END on PPC32 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/42e7e36ad32e0fdf76692426cc642799c9f689b8.1566309263.git.christophe.leroy@c-s.fr
2019-08-27powerpc/mm: refactor ioremap_range() and use ioremap_page_range()Christophe Leroy
book3s64's ioremap_range() is almost same as fallback ioremap_range(), except that it calls radix__ioremap_range() when radix is enabled. radix__ioremap_range() is also very similar to the other ones, expect that it calls ioremap_page_range when slab is available. PPC32 __ioremap_caller() have a loop doing the same thing as ioremap_range() so use it on PPC32 as well. Lets keep only one version of ioremap_range() which calls ioremap_page_range() on all platforms when slab is available. At the same time, drop the nid parameter which is not used. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4b1dca7096b01823b101be7338983578641547f1.1566309263.git.christophe.leroy@c-s.fr
2019-08-27powerpc/mm: Move ioremap functions out of pgtable_32/64.cChristophe Leroy
Create ioremap_32.c and ioremap_64.c and move respective ioremap functions out of pgtable_32.c and pgtable_64.c In the meantime, fix a few comments and changes a printk() to pr_warn(). Also fix a few oversplitted lines. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b5c8b02ccefd4ede64c61b53cf64fb5dacb35740.1566309263.git.christophe.leroy@c-s.fr
2019-08-27powerpc/mm: make ioremap_bot common to allChristophe Leroy
Drop multiple definitions of ioremap_bot and make one common to all subarches. Only CONFIG_PPC_BOOK3E_64 had a global static init value for ioremap_bot. Now ioremap_bot is set in early_init_mmu_global(). Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/920eebfd9f36f14c79d1755847f5bf7c83703bdd.1566309262.git.christophe.leroy@c-s.fr
2019-08-27powerpc/mm: move ioremap_prot() into ioremap.cChristophe Leroy
Both ioremap_prot() are idenfical, move them into ioremap.c Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0b3eb0e0f1490a99fd6c983e166fb8946233f151.1566309262.git.christophe.leroy@c-s.fr
2019-08-27powerpc/mm: move common 32/64 bits ioremap functions into ioremap.cChristophe Leroy
ioremap(), ioremap_wc() and ioremap_coherent() are now identical on PPC32 and PPC64 as iowa_is_active() will always return false on PPC32. Move them into a new common location called ioremap.c Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/6223803ce024d6ab4dfaa919f44098aed5b4bc33.1566309262.git.christophe.leroy@c-s.fr
2019-08-27powerpc/mm: rework io-workaround invocation.Christophe Leroy
ppc_md.ioremap() is only used for I/O workaround on CELL platform, so indirect function call can be avoided. This patch reworks the io-workaround and ioremap() functions to use the global 'io_workaround_inited' flag for the activation of io-workaround. When CONFIG_PPC_IO_WORKAROUNDS or CONFIG_PPC_INDIRECT_MMIO are not selected, the I/O workaround ioremap() voids and the global flag is not used. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5fa3ef069fbd0f152512afaae19e7a60161454cf.1566309262.git.christophe.leroy@c-s.fr
2019-08-27powerpc/mm: drop function __ioremap()Christophe Leroy
__ioremap() is not used anymore, drop it. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ccc439f481a0884e00a6be1bab44bab2a4477fea.1566309262.git.christophe.leroy@c-s.fr
2019-08-27powerpc/mm: drop ppc_md.iounmap() and __iounmap()Christophe Leroy
ppc_md.iounmap() is never set, drop it. Once ppc_md.iounmap() is gone, iounmap() remains the only user of __iounmap() and iounmap() does nothing else than calling __iounmap(). So drop iounmap() and make __iounmap() the new iounmap(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d73ba92bb7a387cc58cc34666d7f5158a45851b0.1566309262.git.christophe.leroy@c-s.fr
2019-08-27powerpc/ps3: replace __ioremap() by ioremap_prot()Christophe Leroy
__ioremap() is similar to ioremap_prot() except that ioremap_prot() does a few sanity changes in addition. The flags used by PS3 are not impacted by those changes so for PS3 both functions are equivalent. At the same time, drop parts of the comment that have been invalid since commit e58e87adc8bf ("powerpc/mm: Update _PAGE_KERNEL_RO") Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/36bff5d875ff562889c5e12dab63e5d7c5d1fbd8.1566309262.git.christophe.leroy@c-s.fr
2019-08-27powerpc: remove the ppc44x ocm.c fileChristoph Hellwig
The on chip memory allocator is entirely unused in the kernel tree. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7b1668941ad1041d08b19167030868de5840b153.1566309262.git.christophe.leroy@c-s.fr
2019-08-27powerpc/64: don't select ARCH_HAS_SCALED_CPUTIME on book3EChristophe Leroy
Book3E doesn't have SPRN_SPURR/SPRN_PURR. Activating ARCH_HAS_SCALED_CPUTIME is just wasting CPU time. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/linuxppc/issues/issues/171 Link: https://lore.kernel.org/r/a8b567c569aa521a7cf1beb061d43d79070e850c.1566492229.git.christophe.leroy@c-s.fr
2019-08-27powerpc/64s: support nospectre_v2 cmdline optionChristopher M. Riedl
Add support for disabling the kernel implemented spectre v2 mitigation (count cache flush on context switch) via the nospectre_v2 and mitigations=off cmdline options. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190524024647.381-1-cmr@informatik.wtf
2019-08-26selftests/powerpc: Retry on host facility unavailableGustavo Romero
TM test tm-unavailable must take into account aborts due to host aborting a transactin because of a facility unavailable exception, just like it already does for aborts on reschedules (TM_CAUSE_KVM_RESCHED). Reported-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com> Tested-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com> Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1566341651-19747-1-git-send-email-gromero@linux.vnet.ibm.com
2019-08-26selftests/powerpc: Fix and enhance TM signal context testsGustavo Romero
Currently TM signal context tests for GPR, FPR, VMX, and VSX registers print wrong register numbers (wrongly starting from register 0 instead of the first register in the non-volatile subset). Besides it the output when a mismatch happens is poor giving not much information about which context and which register mismatches, because it prints both contexts at the same time and not a comparison between the value that mismatches and the value expected and, moreover, it stops printing on the first mismatch, but it's important to know if there are other mismatches happening beyond the first one. For instance, this is the current output when a mismatch happens: test: tm_signal_context_chk_gpr tags: git_version:v5.2-8249-g02e970fae465-dirty Failed on 0 GPR 1 or 18446744073709551615 failure: tm_signal_context_chk_gpr test: tm_signal_context_chk_fpu tags: git_version:v5.2-8248-g09c289e3ef80 Failed on 0 FP -1 or -1 failure: tm_signal_context_chk_fpu test: tm_signal_context_chk_vmx tags: git_version:v5.2-8248-g09c289e3ef80 Failed on 0 vmx 0xfffffffffffffffefffffffdfffffffc vs 0xfffffffffffffffefffffffdfffffffc failure: tm_signal_context_chk_vmx test: tm_signal_context_chk_vsx tags: git_version:v5.2-8248-g09c289e3ef80 Failed on 0 vsx 0xfffffffffefffffffdfffffffcffffff vs 0xfffffffffefffffffdfffffffcffffff failure: tm_signal_context_chk_vsx This commit fixes the register numbers printed and enhances the error output by providing a full list of mismatching registers separated by the context (non-speculative or speculative context), for example: test: tm_signal_context_chk_gpr tags: git_version:v5.2-8249-g02e970fae465-dirty GPR14 (1st context) == 1 instead of -1 (expected) GPR15 (1st context) == 2 instead of -2 (expected) GPR14 (2nd context) == 0 instead of 18446744073709551615 (expected) GPR15 (2nd context) == 0 instead of 18446744073709551614 (expected) failure: tm_signal_context_chk_gpr test: tm_signal_context_chk_fpu tags: git_version:v5.2-8249-g02e970fae465-dirty FPR14 (1st context) == -1 instead of 1 (expected) FPR15 (1st context) == -2 instead of 2 (expected) failure: tm_signal_context_chk_fpu test: tm_signal_context_chk_vmx tags: git_version:v5.2-8249-g02e970fae465-dirty VMX20 (1st context) == 0xfffffffffffffffefffffffdfffffffc instead of 0x00000001000000020000000300000004 (expected) VMX21 (1st context) == 0xfffffffbfffffffafffffff9fffffff8 instead of 0x00000005000000060000000700000008 (expected) failure: tm_signal_context_chk_vmx test: tm_signal_context_chk_vsx tags: git_version:v5.2-8249-g02e970fae465-dirty VSX20 (1st context) == 0xfffffffffefffffffdfffffffcffffff instead of 0x00000001000000020000000300000004 (expected) VSX21 (1st context) == 0xfbfffffffafffffff9fffffff8ffffff instead of 0x00000005000000060000000700000008 (expected) failure: tm_signal_context_chk_vsx Finally, this commit adds comments to the tests in the hope that it will help people not so familiar with TM understand the tests. Signed-off-by: Gustavo Romero <gromero@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190814205211.24840-1-gromero@linux.ibm.com
2019-08-22powerpc/configs: Disable /dev/port in skiroot defconfigDaniel Axtens
While reviewing lockdown patches, I discovered that we still enable /dev/port (CONFIG_DEVPORT) in skiroot. /dev/port is used for old x86 style IO accesses. It's set up in drivers/char/mem.c, and is only created if arch_has_dev_port() returns true. Per arch/powerpc/include/asm/io.h, on PPC64 with PCI, this is only true if there's a legacy ISA bridge. Even if a system has a legacy ISA bridge installed, we have no business accessing it in skiroot. Deselect CONFIG_DEVPORT for skiroot. Signed-off-by: Daniel Axtens <dja@axtens.net> [mpe: Incorporate emailed comments into the change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190627053008.29315-1-dja@axtens.net
2019-08-22selftests/powerpc: Ignore generated filesGustavo Romero
Currently some binary files which are generated when tests are compiled are not ignored by git, so 'git status' catch them. For copyloops test, fix wrong binary names already in .gitignore. For ptrace, security, and stringloops tests add missing binary names to the .gitignore file. Signed-off-by: Gustavo Romero <gromero@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190814205638.25322-2-gromero@linux.ibm.com
2019-08-22powerpc: Document xmon optionsGustavo Romero
Document all options currently supported by xmon debugger. Signed-off-by: Gustavo Romero <gromero@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190814205638.25322-1-gromero@linux.ibm.com
2019-08-22powerpc/eeh: Slightly simplify eeh_add_to_parent_pe()Sam Bobroff
Simplify some needlessly complicated boolean logic in eeh_add_to_parent_pe(). Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/09259a50308f10aa764695912bc87dc1d1cf654c.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/eeh: Remove unused return path from eeh_pe_dev_traverse()Sam Bobroff
There are no users of the early-out return value from eeh_pe_dev_traverse(), so remove it. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c648070f5b28fe8ca1880b48e64b267959ffd369.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/eeh: Fix crash when edev->pdev changesSam Bobroff
If a PCI device is removed during eeh_pe_report_edev(), between the calls to device_lock() and device_unlock(), edev->pdev will change and cause a crash as the wrong mutex is released. To correct this, hold the PCI rescan/remove lock while taking a copy of edev->pdev and performing a get_device() on it. Use this value to release the mutex, but also pass it through to the device driver's EEH handlers so that they always see the same device. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3c590579a0faa24d20c826dcd26c739eb4d454e6.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/eeh: Convert log messages to eeh_edev_* macrosSam Bobroff
Convert existing messages, where appropriate, to use the eeh_edev_* logging macros. The only effect should be minor adjustments to the log messages, apart from: - A new message in pseries_eeh_probe() "Probing device" to match the powernv case. - The "Probing device" message in pnv_eeh_probe() is now generated slightly later, which will mean that it is no longer emitted for devices that aren't probed due to the initial checks. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ce505a0a7a4a5b0367f0f40f8b26e7c0a9cf4cb7.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/eeh: Introduce EEH edev logging macrosSam Bobroff
Now that struct eeh_dev includes the BDFN of it's PCI device, make use of it to replace eeh_edev_info() with a set of dev_dbg()-style macros that only need a struct edev. With the BDFN available without the struct pci_dev, eeh_pci_name() is now unnecessary, so remove it. While only the "info" level function is used here, the others will be used in followup work. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f90ae9a53d762be7b0ccbad79e62b5a1b4f4996e.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/eeh: Add bdfn field to eeh_devOliver O'Halloran
Preparation for removing pci_dn from the powernv EEH code. The only thing we really use pci_dn for is to get the bdfn of the device for config space accesses, so adding that information to eeh_dev reduces the need to carry around the pci_dn. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [SB: Re-wrapped commit message, fixed whitespace damage.] Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e458eb69a1f591d8a120782f23a8506b15d3c654.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/eeh: Refactor around eeh_probe_devices()Sam Bobroff
Now that EEH support for all devices (on PowerNV and pSeries) is provided by the pcibios bus add device hooks, eeh_probe_devices() and eeh_addr_cache_build() are redundant and can be removed. Move the EEH enabled message into it's own function so that it can be called from multiple places. Note that previously on pSeries, useless EEH sysfs files were created for some devices that did not have EEH support and this change prevents them from being created. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/33b0a6339d5ac88693de092d6fba984f2a5add66.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/eeh: EEH for pSeries hot plugSam Bobroff
On PowerNV and pSeries, devices currently acquire EEH support from several different places: Boot-time devices from eeh_probe_devices() and eeh_addr_cache_build(), Virtual Function devices from the pcibios bus add device hooks and hot plugged devices from pci_hp_add_devices() (with other platforms using other methods as well). Unfortunately, pSeries machines currently discover hot plugged devices using pci_rescan_bus(), not pci_hp_add_devices(), and so those devices do not receive EEH support. Rather than adding another case for pci_rescan_bus(), this change widens the scope of the pcibios bus add device hooks so that they can handle all devices. As a side effect this also supports devices discovered after manually rescanning via /sys/bus/pci/rescan. Note that on PowerNV, this change allows the EEH subsystem to become enabled after boot as long as it has not been forced off, which was not previously possible (it was already possible on pSeries). Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/72ae8ae9c54097158894a52de23690448de38ea9.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/eeh: Initialize EEH address cache earlierSam Bobroff
The EEH address cache is currently initialized and populated by a single function: eeh_addr_cache_build(). While the initial population of the cache can only be done once resources are allocated, initialization (just setting up a spinlock) could be done much earlier. So move the initialization step into a separate function and call it from a core_initcall (rather than a subsys initcall). This will allow future work to make use of the cache during boot time PCI scanning. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0557206741bffee76cdfff042f65321f6f7a5b41.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/eeh: Improve debug messages around device additionSam Bobroff
Also remove useless comment. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/59db84f4bf94718a12f206bc923ac797d47e4cc1.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flagSam Bobroff
The EEH_DEV_NO_HANDLER flag is used by the EEH system to prevent the use of driver callbacks in drivers that have been bound part way through the recovery process. This is necessary to prevent later stage handlers from being called when the earlier stage handlers haven't, which can be confusing for drivers. However, the flag is set for all devices that are added after boot time and only cleared at the end of the EEH recovery process. This results in hot plugged devices erroneously having the flag set during the first recovery after they are added (causing their driver's handlers to be incorrectly ignored). To remedy this, clear the flag at the beginning of recovery processing. The flag is still cleared at the end of recovery processing, although it is no longer really necessary. Also clear the flag during eeh_handle_special_event(), for the same reasons. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b8ca5629d27de74c957d4f4b250177d1b6fc4bbd.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc/64: Adjust order in pcibios_init()Sam Bobroff
The pcibios_init() function for PowerPC 64 currently calls pci_bus_add_devices() before pcibios_resource_survey(). This means that at boot time, when the pcibios_bus_add_device() hooks are called by pci_bus_add_devices(), device resources have not been allocated and they are unable to perform EEH setup, so a separate pass is needed. This patch adjusts that order so that it will become possible to consolidate the EEH setup work into a single location. The only functional change is to execute pcibios_resource_survey() (excepting ppc_md.pcibios_fixup(), see below) before pci_bus_add_devices() instead of after it. Because pcibios_scan_phb() and pci_bus_add_devices() are called together in a loop, this must be broken into one loop for each call. Then the call to pcibios_resource_survey() is moved up in between them. This changes the ordering but because pcibios_resource_survey() also calls ppc_md.pcibios_fixup(), that call is extracted out into pcibios_init() to where pcibios_resource_survey() was, so that it is not moved. The only other caller of pcibios_resource_survey() is the PowerPC 32 version of pcibios_init(), and therefore, that is modified to call ppc_md.pcibios_fixup() right after pcibios_resource_survey() so that there is no functional change there at all. The re-arrangement will cause very few side-effects because at this stage in the boot, pci_bus_add_devices() does very little: - pci_create_sysfs_dev_files() does nothing (no sysfs yet) - pci_proc_attach_device() does nothing (no proc yet) - device_attach() does nothing (no drivers yet) This leaves only the pci_final_fixup calls, D3 support, and marking the device as added. Of those, only the pci_final_fixup calls have the potential to be affected by resource allocation. The only pci_final_fixup handlers that touch resources seem to be one for x86 (pci_amd_enable_64bit_bar()), and a PowerPC 32 platform driver (quirk_final_uli1575()), neither of which use this pcibios_init() function. Even if they did, it would almost certainly be a bug, under the current ordering, to rely on or make changes to resources before they were allocated. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4506b0489eabd0921a3587d90bd44c7683f3472d.1565930772.git.sbobroff@linux.ibm.com
2019-08-22powerpc: remove meaningless KBUILD_ARFLAGS additionMasahiro Yamada
The KBUILD_ARFLAGS addition in arch/powerpc/Makefile has never worked in a useful way because it is always overridden by the following code in the top Makefile: # use the deterministic mode of AR if available KBUILD_ARFLAGS := $(call ar-option,D) The code in the top Makefile was added in 2011, by commit 40df759e2b9e ("kbuild: Fix build with binutils <= 2.19"). The KBUILD_ARFLAGS addition for ppc has always been dead code from the beginning. Nobody has reported a problem since 43c9127d94d6 ("powerpc: Add option to use thin archives"), so this code was unneeded. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190713032106.8509-1-yamada.masahiro@socionext.com
2019-08-21powerpc: add machine check safe copy_to_userSantosh Sivaraj
Use memcpy_mcsafe() implementation to define copy_to_user_mcsafe() Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190820081352.8641-8-santosh@fossix.org
2019-08-21powerpc/memcpy: Add memcpy_mcsafe for pmemBalbir Singh
The pmem infrastructure uses memcpy_mcsafe in the pmem layer so as to convert machine check exceptions into a return value on failure in case a machine check exception is encountered during the memcpy. The return value is the number of bytes remaining to be copied. This patch largely borrows from the copyuser_power7 logic and does not add the VMX optimizations, largely to keep the patch simple. If needed those optimizations can be folded in. Signed-off-by: Balbir Singh <bsingharora@gmail.com> [arbab@linux.ibm.com: Added symbol export] Co-developed-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190820081352.8641-7-santosh@fossix.org
2019-08-21powerpc/mce: Handle UE event for memcpy_mcsafeBalbir Singh
If we take a UE on one of the instructions with a fixup entry, set nip to continue execution at the fixup entry. Stop processing the event further or print it. Co-developed-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190820081352.8641-6-santosh@fossix.org
2019-08-21extable: Add function to search only kernel exception tableSantosh Sivaraj
Certain architecture specific operating modes (e.g., in powerpc machine check handler that is unable to access vmalloc memory), the search_exception_tables cannot be called because it also searches the module exception tables if entry is not found in the kernel exception table. Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190820081352.8641-5-santosh@fossix.org
2019-08-21powerpc/mce: Make machine_check_ue_event() staticReza Arbab
The function doesn't get used outside this file, so make it static. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190820081352.8641-4-santosh@fossix.org
2019-08-21powerpc/mce: Fix MCE handling for huge pagesBalbir Singh
The current code would fail on huge pages addresses, since the shift would be incorrect. Use the correct page shift value returned by __find_linux_pte() to get the correct physical address. The code is more generic and can handle both regular and compound pages. Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors") Signed-off-by: Balbir Singh <bsingharora@gmail.com> [arbab@linux.ibm.com: Fixup pseries_do_memory_failure()] Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Tested-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190820081352.8641-3-santosh@fossix.org
2019-08-21powerpc/mce: Schedule work from irq_workSantosh Sivaraj
schedule_work() cannot be called from MCE exception context as MCE can interrupt even in interrupt disabled context. Fixes: 733e4a4c4467 ("powerpc/mce: hookup memory_failure for UE errors") Cc: stable@vger.kernel.org # v4.15+ Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190820081352.8641-2-santosh@fossix.org
2019-08-20powerpc/pseries/mobility: use cond_resched when updating device treeNathan Lynch
After a partition migration, pseries_devicetree_update() processes changes to the device tree communicated from the platform to Linux. This is a relatively heavyweight operation, with multiple device tree searches, memory allocations, and conversations with partition firmware. There's a few levels of nested loops which are bounded only by decisions made by the platform, outside of Linux's control, and indeed we have seen RCU stalls on large systems while executing this call graph. Use cond_resched() in these loops so that the cpu is yielded when needed. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802192926.19277-4-nathanl@linux.ibm.com
2019-08-20powerpc/rtas: allow rescheduling while changing cpu statesNathan Lynch
rtas_cpu_state_change_mask() potentially operates on scores of cpus, so explicitly allow rescheduling in the loop body. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802192926.19277-3-nathanl@linux.ibm.com
2019-08-20powerpc/rtas: use device model APIs and serialization during LPMNathan Lynch
The LPAR migration implementation and userspace-initiated cpu hotplug can interleave their executions like so: 1. Set cpu 7 offline via sysfs. 2. Begin a partition migration, whose implementation requires the OS to ensure all present cpus are online; cpu 7 is onlined: rtas_ibm_suspend_me -> rtas_online_cpus_mask -> cpu_up This sets cpu 7 online in all respects except for the cpu's corresponding struct device; dev->offline remains true. 3. Set cpu 7 online via sysfs. _cpu_up() determines that cpu 7 is already online and returns success. The driver core (device_online) sets dev->offline = false. 4. The migration completes and restores cpu 7 to offline state: rtas_ibm_suspend_me -> rtas_offline_cpus_mask -> cpu_down This leaves cpu7 in a state where the driver core considers the cpu device online, but in all other respects it is offline and unused. Attempts to online the cpu via sysfs appear to succeed but the driver core actually does not pass the request to the lower-level cpuhp support code. This makes the cpu unusable until the cpu device is manually set offline and then online again via sysfs. Instead of directly calling cpu_up/cpu_down, the migration code should use the higher-level device core APIs to maintain consistent state and serialize operations. Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation") Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802192926.19277-2-nathanl@linux.ibm.com
2019-08-20powerpc/603: Fix handling of the DIRTY flagChristophe Leroy
If a page is already mapped RW without the DIRTY flag, the DIRTY flag is never set and a TLB store miss exception is taken forever. This is easily reproduced with the following app: void main(void) { volatile char *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); *ptr = *ptr; } When DIRTY flag is not set, bail out of TLB miss handler and take a minor page fault which will set the DIRTY flag. Fixes: f8b58c64eaef ("powerpc/603: let's handle PAGE_DIRTY directly") Cc: stable@vger.kernel.org # v5.1+ Reported-by: Doug Crawford <doug.crawford@intelight-its.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/80432f71194d7ee75b2f5043ecf1501cf1cca1f3.1566196646.git.christophe.leroy@c-s.fr
2019-08-20powerpc/64s/radix: Remove redundant pfn_pte bitop, add VM_BUG_ONNicholas Piggin
pfn_pte is never given a pte above the addressable physical memory limit, so the masking is redundant. In case of a software bug, it is not obviously better to silently truncate the pfn than to corrupt the pte (either one will result in memory corruption or crashes), so there is no reason to add this to the fast path. Add VM_BUG_ON to catch cases where the pfn is invalid. These would catch the create_section_mapping bug fixed by a previous commit. [16885.256466] ------------[ cut here ]------------ [16885.256492] kernel BUG at arch/powerpc/include/asm/book3s/64/pgtable.h:612! cpu 0x0: Vector: 700 (Program Check) at [c0000000ee0a36d0] pc: c000000000080738: __map_kernel_page+0x248/0x6f0 lr: c000000000080ac0: __map_kernel_page+0x5d0/0x6f0 sp: c0000000ee0a3960 msr: 9000000000029033 current = 0xc0000000ec63b400 paca = 0xc0000000017f0000 irqmask: 0x03 irq_happened: 0x01 pid = 85, comm = sh kernel BUG at arch/powerpc/include/asm/book3s/64/pgtable.h:612! Linux version 5.3.0-rc1-00001-g0fe93e5f3394 enter ? for help [c0000000ee0a3a00] c000000000d37378 create_physical_mapping+0x260/0x360 [c0000000ee0a3b10] c000000000d370bc create_section_mapping+0x1c/0x3c [c0000000ee0a3b30] c000000000071f54 arch_add_memory+0x74/0x130 Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190724084638.24982-5-npiggin@gmail.com
2019-08-20powerpc/64: Add VIRTUAL_BUG_ON checks for __va and __pa addressesNicholas Piggin
Ensure __va is given a physical address below PAGE_OFFSET, and __pa is given a virtual address above PAGE_OFFSET. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190724084638.24982-4-npiggin@gmail.com
2019-08-20powerpc/perf: fix imc allocation failure handlingNicholas Piggin
The alloc_pages_node return value should be tested for failure before being passed to page_address. Tested-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190724084638.24982-3-npiggin@gmail.com
2019-08-20powerpc/64s/radix: Fix memory hot-unplug page table splitNicholas Piggin
create_physical_mapping expects physical addresses, but splitting these mapping on hot unplug is supplying virtual (effective) addresses. Fixes: 4dd5f8a99e791 ("powerpc/mm/radix: Split linear mapping on hot-unplug") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190724084638.24982-2-npiggin@gmail.com