summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-11-17powerpc/85xx: remove mostly pointless mpc85xx_qe_init()Rasmus Villemoes
Since commit 302c059f2e7b (QE: use subsys_initcall to init qe), mpc85xx_qe_init() has done nothing apart from possibly emitting a pr_err(). As part of reducing the amount of QE-related code in arch/powerpc/ (and eventually support QE on other architectures), remove this low-hanging fruit. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Scott Wood <oss@buserror.net>
2019-11-17powerpc/kmcent2: update the ethernet devices' phy propertiesValentin Longchamp
Change all phy-connection-type properties to phy-mode that are better supported by the fman driver. Use the more readable fixed-link node for the 2 sgmii links. Change the RGMII link to rgmii-id as the clock delays are added by the phy. Signed-off-by: Valentin Longchamp <valentin@longchamp.me> Acked-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net>
2019-11-13powerpc/fadump: when fadump is supported register the fadump sysfs files.Michal Suchanek
Currently it is not possible to distinguish the case when fadump is supported by firmware and disabled in kernel and completely unsupported using the kernel sysfs interface. User can investigate the devicetree but it is more reasonable to provide sysfs files in case we get some fadumpv2 in the future. With this patch sysfs files are available whenever fadump is supported by firmware. There is duplicate message about lack of support by firmware in fadump_reserve_mem and setup_fadump. Remove the duplicate message in setup_fadump. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191107164757.15140-1-msuchanek@suse.de
2019-11-13powerpc/perf: remove current_is_64bit()Michal Suchanek
Since commit ed1cd6deb013 ("powerpc: Activate CONFIG_THREAD_INFO_IN_TASK") current_is_64bit() is quivalent to !is_32bit_task(). Remove the redundant function. Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190912194633.12045-1-msuchanek@suse.de
2019-11-13powerpc/eeh: differentiate duplicate detection messageSam Bobroff
Currently when an EEH error is detected, the system log receives the same (or almost the same) message twice: EEH: PHB#0 failure detected, location: N/A EEH: PHB#0 failure detected, location: N/A or EEH: eeh_dev_check_failure: Frozen PHB#0-PE#0 detected EEH: Frozen PHB#0-PE#0 detected This looks like a bug, but in fact the messages are from different functions and mean slightly different things. So keep both but change one of the messages slightly, so that it's clear they are different: EEH: PHB#0 failure detected, location: N/A EEH: Recovering PHB#0, location: N/A or EEH: eeh_dev_check_failure: Frozen PHB#0-PE#0 detected EEH: Recovering PHB#0-PE#0 Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/43817cb6e6631b0828b9a6e266f60d1f8ca8eb22.1571288375.git.sbobroff@linux.ibm.com
2019-11-13powerpc/pseries/hotplug-memory: Change rc variable to boolLeonardo Bras
Changes the return variable to bool (as the return value) and avoids doing a ternary operation before returning. Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802133914.30413-1-leonardo@linux.ibm.com
2019-11-13powerpc: use <asm-generic/dma-mapping.h>Christoph Hellwig
The powerpc version of dma-mapping.h only contains a version of get_arch_dma_ops that always return NULL. Replace it with the asm-generic version that does the same. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190807150752.17894-1-hch@lst.de
2019-11-13powerpc/xive: Prevent page fault issues in the machine crash handlerCédric Le Goater
When the machine crash handler is invoked, all interrupts are masked but interrupts which have not been started yet do not have an ESB page mapped in the Linux address space. This crashes the 'crash kexec' sequence on sPAPR guests. To fix, force the mapping of the ESB page when an interrupt is being mapped in the Linux IRQ number space. This is done by setting the initial state of the interrupt to OFF which is not necessarily the case on PowerNV. Fixes: 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031063100.3864-1-clg@kaod.org
2019-11-13powerpc/64s/exception: Fix kaup -> kuap typoAndrew Donnellan
It's KUAP, not KAUP. Fix typo in INT_COMMON macro. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191022060603.24101-1-ajd@linux.ibm.com
2019-11-13powerpc: Replace GPL boilerplate with SPDX identifiersThomas Huth
The FSF does not reside in "675 Mass Ave, Cambridge" anymore... let's simply use proper SPDX identifiers instead. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190828060737.32531-1-thuth@redhat.com
2019-11-13powerpc/book3s/mm: Update Oops message to print the correct translation in useAneesh Kumar K.V
Avoids confusion when printing Oops message like below Faulting instruction address: 0xc00000000008bdb4 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV This was because we never clear the MMU_FTR_HPTE_TABLE feature flag even if we run with radix translation. It was discussed that we should look at this feature flag as an indication of the capability to run hash translation and we should not clear the flag even if we run in radix translation. All the code paths check for radix_enabled() check and if found true consider we are running with radix translation. Follow the same sequence for finding the MMU translation string to be used in Oops message. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190711145814.17970-1-aneesh.kumar@linux.ibm.com
2019-11-13powerpc/spufs: remove set but not used variable 'ctx'YueHaibing
arch/powerpc/platforms/cell/spufs/inode.c:201:22: warning: variable ctx set but not used [-Wunused-but-set-variable] It is not used since commit 67cba9fd6456 ("move spu_forget() into spufs_rmdir()") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191023134423.15052-1-yuehaibing@huawei.com
2019-11-13powerpc/powernv/ioda: using kfree_rcu() to simplify the codeYueHaibing
The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190711141818.18044-1-yuehaibing@huawei.com
2019-11-13powerpc/powernv: Make some symbols staticYueHaibing
Fix sparse warnings: arch/powerpc/platforms/powernv/opal-psr.c:20:1: warning: symbol 'psr_mutex' was not declared. Should it be static? arch/powerpc/platforms/powernv/opal-psr.c:27:3: warning: symbol 'psr_attrs' was not declared. Should it be static? arch/powerpc/platforms/powernv/opal-powercap.c:20:1: warning: symbol 'powercap_mutex' was not declared. Should it be static? arch/powerpc/platforms/powernv/opal-sensor-groups.c:20:1: warning: symbol 'sg_mutex' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190702131733.44100-1-yuehaibing@huawei.com
2019-11-13powerpc/configs: remove obsolete CONFIG_INET_XFRM_MODE_* and ↵YueHaibing
CONFIG_INET6_XFRM_MODE_* These Kconfig options has been removed in commit 4c145dce2601 ("xfrm: make xfrm modes builtin") So there is no point to keep it in defconfigs any longer. Signed-off-by: YueHaibing <yuehaibing@huawei.com> [mpe: Extract from cross arch patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190612071901.21736-1-yuehaibing@huawei.com
2019-11-13powerpc/pseries: Fix platform_no_drv_owner.cocci warningsYueHaibing
Remove .owner field if calls are used which set it automatically Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190218133950.95225-1-yuehaibing@huawei.com
2019-11-13powerpc/pseries: Drop pointless static qualifier in vpa_debugfs_init()YueHaibing
There is no need to have the 'struct dentry *vpa_dir' variable static since new value always be assigned before use it. Fixes: c6c26fb55e8e ("powerpc/pseries: Export raw per-CPU VPA data via debugfs") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190218125644.87448-1-yuehaibing@huawei.com
2019-11-13powerpc/powernv/npu: Fix debugfs_simple_attr.cocci warningsYueHaibing
Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for debugfs files. Semantic patch information: Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file() imposes some significant overhead as compared to DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe(). Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1545705876-63132-1-git-send-email-yuehaibing@huawei.com
2019-11-13powerpc/64s: Fix debugfs_simple_attr.cocci warningsYueHaibing
Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for debugfs files. Semantic patch information: Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file() imposes some significant overhead as compared to DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe(). Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1543498518-107601-1-git-send-email-yuehaibing@huawei.com
2019-11-13powerpc/pseries: Use correct event modifier in rtas_parse_epow_errlog()YueHaibing
rtas_parse_epow_errlog() should pass 'modifier' to handle_system_shutdown, because event modifier only use bottom 4 bits. Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191023134838.21280-1-yuehaibing@huawei.com
2019-11-13powerpc/watchpoint: Support for 8xx in ptrace-hwbreak.c selftestRavi Bangoria
On the 8xx, signals are generated after executing the instruction. So no need to manually single-step on 8xx. Also, 8xx __set_dabr() currently ignores length and hardcodes the length to 8 bytes. So all unaligned and 512 byte testcase will fail on 8xx. Ignore those testcases on 8xx. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-8-ravi.bangoria@linux.ibm.com
2019-11-13powerpc/watchpoint: Add DAR outside test in perf-hwbreak.c selftestRavi Bangoria
So far we used to ignore exception if DAR points outside of user specified range. But now we are ignoring it only if actual load/store range does not overlap with user specified range. Include selftests for the same: # ./tools/testing/selftests/powerpc/ptrace/perf-hwbreak ... TESTED: No overlap TESTED: Partial overlap TESTED: Partial overlap TESTED: No overlap TESTED: Full overlap success: perf_hwbreak Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-7-ravi.bangoria@linux.ibm.com
2019-11-13selftests/powerpc: Rewrite ptrace-hwbreak.c selftestRavi Bangoria
ptrace-hwbreak.c selftest is logically broken. On powerpc, when watchpoint is created with ptrace, signals are generated before executing the instruction and user has to manually singlestep the instruction with watchpoint disabled, which selftest never does and thus it keeps on getting the signal at the same instruction. If we fix it, selftest fails because the logical connection between tracer(parent) and tracee(child) is also broken. Rewrite the selftest and add new tests for unaligned access. With patch: $ ./tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak test: ptrace-hwbreak tags: git_version:powerpc-5.3-4-224-g218b868240c7-dirty PTRACE_SET_DEBUGREG, WO, len: 1: Ok PTRACE_SET_DEBUGREG, WO, len: 2: Ok PTRACE_SET_DEBUGREG, WO, len: 4: Ok PTRACE_SET_DEBUGREG, WO, len: 8: Ok PTRACE_SET_DEBUGREG, RO, len: 1: Ok PTRACE_SET_DEBUGREG, RO, len: 2: Ok PTRACE_SET_DEBUGREG, RO, len: 4: Ok PTRACE_SET_DEBUGREG, RO, len: 8: Ok PTRACE_SET_DEBUGREG, RW, len: 1: Ok PTRACE_SET_DEBUGREG, RW, len: 2: Ok PTRACE_SET_DEBUGREG, RW, len: 4: Ok PTRACE_SET_DEBUGREG, RW, len: 8: Ok PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO, len: 1: Ok PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RO, len: 1: Ok PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RW, len: 1: Ok PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW ALIGNED, WO, len: 6: Ok PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW ALIGNED, RO, len: 6: Ok PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW ALIGNED, RW, len: 6: Ok PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW UNALIGNED, WO, len: 6: Ok PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW UNALIGNED, RO, len: 6: Ok PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW UNALIGNED, RW, len: 6: Ok PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW UNALIGNED, DAR OUTSIDE, RW, len: 6: Ok PPC_PTRACE_SETHWDEBUG, DAWR_MAX_LEN, RW, len: 512: Ok success: ptrace-hwbreak Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-6-ravi.bangoria@linux.ibm.com
2019-11-13powerpc/watchpoint: Don't ignore extraneous exceptions blindlyRavi Bangoria
On powerpc, watchpoint match range is double-word granular. On a watchpoint hit, DAR is set to the first byte of overlap between actual access and watched range. And thus it's quite possible that DAR does not point inside user specified range. Ex, say user creates a watchpoint with address range 0x1004 to 0x1007. So hw would be configured to watch from 0x1000 to 0x1007. If there is a 4 byte access from 0x1002 to 0x1005, DAR will point to 0x1002 and thus interrupt handler considers it as extraneous, but it's actually not, because part of the access belongs to what user has asked. Instead of blindly ignoring the exception, get actual address range by analysing an instruction, and ignore only if actual range does not overlap with user specified range. Note: The behavior is unchanged for 8xx. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-5-ravi.bangoria@linux.ibm.com
2019-11-13powerpc/watchpoint: Fix ptrace code that muck around with address/lenRavi Bangoria
ptrace_set_debugreg() does not consider new length while overwriting the watchpoint. Fix that. ppc_set_hwdebug() aligns watchpoint address to doubleword boundary but does not change the length. If address range is crossing doubleword boundary and length is less then 8, we will lose samples from second doubleword. So fix that as well. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-4-ravi.bangoria@linux.ibm.com
2019-11-13powerpc/watchpoint: Fix length calculation for unaligned targetRavi Bangoria
Watchpoint match range is always doubleword(8 bytes) aligned on powerpc. If the given range is crossing doubleword boundary, we need to increase the length such that next doubleword also get covered. Ex, address len = 6 bytes |=========. |------------v--|------v--------| | | | | | | | | | | | | | | | | | |---------------|---------------| <---8 bytes---> In such case, current code configures hw as: start_addr = address & ~HW_BREAKPOINT_ALIGN len = 8 bytes And thus read/write in last 4 bytes of the given range is ignored. Fix this by including next doubleword in the length. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-3-ravi.bangoria@linux.ibm.com
2019-11-13powerpc/watchpoint: Introduce macros for watchpoint lengthRavi Bangoria
We are hadrcoding length everywhere in the watchpoint code. Introduce macros for the length and use them. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191017093204.7511-2-ravi.bangoria@linux.ibm.com
2019-11-13powerpc/security: Fix wrong message when RFI Flush is disableGustavo L. F. Walbon
The issue was showing "Mitigation" message via sysfs whatever the state of "RFI Flush", but it should show "Vulnerable" when it is disabled. If you have "L1D private" feature enabled and not "RFI Flush" you are vulnerable to meltdown attacks. "RFI Flush" is the key feature to mitigate the meltdown whatever the "L1D private" state. SEC_FTR_L1D_THREAD_PRIV is a feature for Power9 only. So the message should be as the truth table shows: CPU | L1D private | RFI Flush | sysfs ----|-------------|-----------|------------------------------------- P9 | False | False | Vulnerable P9 | False | True | Mitigation: RFI Flush P9 | True | False | Vulnerable: L1D private per thread P9 | True | True | Mitigation: RFI Flush, L1D private per thread P8 | False | False | Vulnerable P8 | False | True | Mitigation: RFI Flush Output before this fix: # cat /sys/devices/system/cpu/vulnerabilities/meltdown Mitigation: RFI Flush, L1D private per thread # echo 0 > /sys/kernel/debug/powerpc/rfi_flush # cat /sys/devices/system/cpu/vulnerabilities/meltdown Mitigation: L1D private per thread Output after fix: # cat /sys/devices/system/cpu/vulnerabilities/meltdown Mitigation: RFI Flush, L1D private per thread # echo 0 > /sys/kernel/debug/powerpc/rfi_flush # cat /sys/devices/system/cpu/vulnerabilities/meltdown Vulnerable: L1D private per thread Signed-off-by: Gustavo L. F. Walbon <gwalbon@linux.ibm.com> Signed-off-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190502210907.42375-1-gwalbon@linux.ibm.com
2019-11-13powerpc/crypto: Add cond_resched() in crc-vpmsum self-testChris Smart
The stress test for vpmsum implementations executes a long for loop in the kernel. This blocks the scheduler, which prevents other tasks from running, resulting in a warning. This fix adds a call to cond_reshed() at the end of each loop, which allows the scheduler to run other tasks as required. Signed-off-by: Chris Smart <chris.smart@humanservices.gov.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191103233356.5472-1-chris.smart@humanservices.gov.au
2019-11-13powerpc/pseries/cmm: Simulation modeDavid Hildenbrand
Let's allow to test the implementation without needing HW support. When "simulate=1" is specified when loading the module, we bypass all HW checks and HW calls. The sysfs file "simulate_loan_target_kb" can be used to simulate HW requests. The simualtion mode can be activated using: modprobe cmm debug=1 simulate=1 And the requested loan target can be changed using: echo X > /sys/devices/system/cmm/cmm0/simulate_loan_target_kb Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-11-david@redhat.com
2019-11-13powerpc/pseries/cmm: Switch to balloon_page_alloc()David Hildenbrand
balloon_page_alloc() will use GFP_HIGHUSER_MOVABLE in case we have CONFIG_BALLOON_COMPACTION. This is now possible, as balloon pages are movable with CONFIG_BALLOON_COMPACTION. Without CONFIG_BALLOON_COMPACTION, GFP_HIGHUSER is used. Note that apart from that, balloon_page_alloc() uses the following flags: __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN And current code used: GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC GFP_HIGHUSER/GFP_HIGHUSER_MOVABLE include __GFP_RECLAIM | __GFP_IO | __GFP_FS | __GFP_HARDWALL | __GFP_HIGHMEM GFP_NOIO is __GFP_RECLAIM. With CONFIG_BALLOON_COMPACTION, we essentially add: __GFP_IO | __GFP_FS | __GFP_HARDWALL | __GFP_HIGHMEM | __GFP_MOVABLE Without CONFIG_BALLOON_COMPACTION, we essentially add: __GFP_IO | __GFP_FS | __GFP_HARDWALL | __GFP_HIGHMEM I assume this is fine, as this is what all other balloon compaction users use. If it turns out to be a problem, we could add __GFP_MOVABLE manually if we have CONFIG_BALLOON_COMPACTION. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-10-david@redhat.com
2019-11-13powerpc/pseries/cmm: Implement balloon compactionDavid Hildenbrand
We can now get rid of the cmm_lock and completely rely on the balloon compaction internals, which now also manage the page list and the lock. Inflated/"loaned" pages are now movable. Memory blocks that contain such pages can get offlined. Also, all such pages will be marked PageOffline() and can therefore be excluded in memory dumps using recent versions of makedumpfile. Don't switch to balloon_page_alloc() yet (due to the GFP_NOIO). Will do that separately to discuss this change in detail. Signed-off-by: David Hildenbrand <david@redhat.com> [mpe: Add isolated_pages-- in cmm_migratepage() as suggested by David] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-9-david@redhat.com
2019-11-13powerpc/pseries/cmm: Convert loaned_pages to an atomic_long_tDavid Hildenbrand
When switching to balloon compaction, we want to drop the cmm_lock and completely rely on the balloon compaction list lock internally. loaned_pages is currently protected under the cmm_lock. Note: Right now cmm_alloc_pages() and cmm_free_pages() can be called at the same time, e.g., via the thread and a concurrent OOM notifier. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-8-david@redhat.com
2019-11-13powerpc/pseries/cmm: Rip out memory isolate notifierDavid Hildenbrand
The memory isolate notifier was added to allow to offline memory blocks that contain inflated/"loaned" pages. We can achieve the same using the balloon compaction framework. Get rid of the memory isolate notifier. Also, we can get rid of cmm_mem_going_offline(), as we will never reach that code path now when we have allocated memory in the balloon (allocated pages are unmovable and will no longer be special-cased using the memory isolation notifier). Leave the memory notifier in place, so we can still back off in case memory gets offlined. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-7-david@redhat.com
2019-11-13powerpc/pseries/cmm: Use adjust_managed_page_count() insted of totalram_pages_*David Hildenbrand
adjust_managed_page_count() performs a totalram_pages_add(), but also adjusts the managed pages of the zone. Let's use that instead, similar to virtio-balloon. Use it before freeing a page. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-6-david@redhat.com
2019-11-13powerpc/pseries/cmm: Drop page arrayDavid Hildenbrand
We can simply store the pages in a list (page->lru), no need for a separate data structure (+ complicated handling). This is how most other balloon drivers store allocated pages without additional tracking data. For the notifiers, use page_to_pfn() to check if a page is in the applicable range. Use page_to_phys() in plpar_page_set_loaned() and plpar_page_set_active() (I assume due to the __pa() that's the right thing to do). Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-5-david@redhat.com
2019-11-13powerpc/pseries/cmm: Cleanup rc handling in cmm_init()David Hildenbrand
No need to initialize rc. Also, let's return 0 directly when succeeding. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-4-david@redhat.com
2019-11-13powerpc/pseries/cmm: Report errors when registering notifiers failsDavid Hildenbrand
If we don't set the rc, we will return "0", making it look like we succeeded. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-3-david@redhat.com
2019-11-13powerpc/pseries/cmm: Implement release() function for sysfs deviceDavid Hildenbrand
When unloading the module, one gets ------------[ cut here ]------------ Device 'cmm0' does not have a release() function, it is broken and must be fixed. See Documentation/kobject.txt. WARNING: CPU: 0 PID: 19308 at drivers/base/core.c:1244 .device_release+0xcc/0xf0 ... We only have one static fake device. There is nothing to do when releasing the device (via cmm_exit()). Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031142933.10779-2-david@redhat.com
2019-11-13powerpc/pseries: Enable support for ibm,drc-info propertyTyrel Datwyler
Advertise client support for the PAPR architected ibm,drc-info device tree property during CAS handshake. Fixes: c7a3275e0f9e ("powerpc/pseries: Revert support for ibm,drc-info devtree property") Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-11-git-send-email-tyreld@linux.ibm.com
2019-11-13PCI: rpaphp: Correctly match ibm, my-drc-index to drc-name when using drc-infoTyrel Datwyler
The newer ibm,drc-info property is a condensed description of the old ibm,drc-* properties (ie. names, types, indexes, and power-domains). When matching a drc-index to a drc-name we need to verify that the index is within the start and last drc-index range and map it to a drc-name using the drc-name-prefix and logical index. Fix the mapping by checking that the index is within the range of the current drc-info entry, and build the name from the drc-name-prefix concatenated with the starting drc-name-suffix value and the sequential index obtained by subtracting ibm,my-drc-index from this entries drc-start-index. Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-10-git-send-email-tyreld@linux.ibm.com
2019-11-13PCI: rpaphp: Annotate and correctly byte swap DRC propertiesTyrel Datwyler
The device tree is in big endian format and any properties directly retrieved using OF helpers that don't explicitly byte swap should be annotated. In particular there are several places where we grab the opaque property value for the old ibm,drc-* properties and the ibm,my-drc-index property. Fix this for better static checking by annotating values we know to explicitly big endian, and byte swap where appropriate. Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-9-git-send-email-tyreld@linux.ibm.com
2019-11-13PCI: rpaphp: Add drc-info support for hotplug slot registrationTyrel Datwyler
Split physical PCI slot registration scanning into separate routines that support the old ibm,drc-* properties and one that supports the new compressed ibm,drc-info property. Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-7-git-send-email-tyreld@linux.ibm.com
2019-11-13PCI: rpaphp: Don't rely on firmware feature to imply drc-info supportTyrel Datwyler
In the event that the partition is migrated to a platform with older firmware that doesn't support the ibm,drc-info property the device tree is modified to remove the ibm,drc-info property and replace it with the older style ibm,drc-* properties for types, names, indexes, and power-domains. One of the requirements of the drc-info firmware feature is that the client is able to handle both the new property, and old style properties at runtime. Therefore we can't rely on the firmware feature alone to dictate which property is currently present in the device tree. Fix this short coming by checking explicitly for the ibm,drc-info property, and falling back to the older ibm,drc-* properties if it doesn't exist. Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-6-git-send-email-tyreld@linux.ibm.com
2019-11-13PCI: rpaphp: Fix up pointer to first drc-info entryTyrel Datwyler
The first entry of the ibm,drc-info property is an int encoded count of the number of drc-info entries that follow. The "value" pointer returned by of_prop_next_u32() is still pointing at the this value when we call of_read_drc_info_cell(), but the helper function expects that value to be pointing at the first element of an entry. Fix up by incrementing the "value" pointer to point at the first element of the first drc-info entry prior. Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-5-git-send-email-tyreld@linux.ibm.com
2019-11-13powerpc/pseries: Add cpu DLPAR support for drc-info propertyTyrel Datwyler
Older firmwares provided information about Dynamic Reconfig Connectors (DRC) through several device tree properties, namely ibm,drc-types, ibm,drc-indexes, ibm,drc-names, and ibm,drc-power-domains. New firmwares have the ability to present this same information in a much condensed format through a device tree property called ibm,drc-info. The existing cpu DLPAR hotplug code only understands the older DRC property format when validating the drc-index of a cpu during a hotplug add. This updates those code paths to use the ibm,drc-info property, when present, instead for validation. Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-4-git-send-email-tyreld@linux.ibm.com
2019-11-13powerpc/pseries: Fix drc-info mappings of logical cpus to drc-indexTyrel Datwyler
There are a couple subtle errors in the mapping between cpu-ids and a cpus associated drc-index when using the new ibm,drc-info property. The first is that while drc-info may have been a supported firmware feature at boot it is possible we have migrated to a CEC with older firmware that doesn't support the ibm,drc-info property. In that case the device tree would have been updated after migration to remove the ibm,drc-info property and replace it with the older style ibm,drc-* properties for types, indexes, names, and power-domains. PAPR even goes as far as dictating that if we advertise support for drc-info that we are capable of supporting either property type at runtime. The second is that the first value of the ibm,drc-info property is the int encoded count of drc-info entries. As such "value" returned by of_prop_next_u32() is pointing at that count, and not the first element of the first drc-info entry as is expected by the of_read_drc_info_cell() helper. Fix the first by ignoring DRC-INFO firmware feature and instead testing directly for ibm,drc-info, and then falling back to the old style ibm,drc-indexes in the case it doesn't exit. Fix the second by incrementing value to the next element prior to parsing drc-info entries. Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-3-git-send-email-tyreld@linux.ibm.com
2019-11-13powerpc/pseries: Fix bad drc_index_start value parsing of drc-info entryTyrel Datwyler
The ibm,drc-info property is an array property that contains drc-info entries such that each entry is made up of 2 string encoded elements followed by 5 int encoded elements. The of_read_drc_info_cell() helper contains comments that correctly name the expected elements and their encoding. However, the usage of of_prop_next_string() and of_prop_next_u32() introduced a subtle skippage of the first u32. This is a result of of_prop_next_string() returning a pointer to the next property value which is not a string, but actually a (__be32 *). As, a result the following call to of_prop_next_u32() passes over the current int encoded value and actually stores the next one wrongly. Simply endian swap the current value in place after reading the first two string values. The remaining int encoded values can then be read correctly using of_prop_next_u32(). Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573449697-5448-2-git-send-email-tyreld@linux.ibm.com
2019-11-13Merge branch 'topic/secureboot' into nextMichael Ellerman
Merge the secureboot support, as well as the IMA changes needed to support it. From Nayna's cover letter: In order to verify the OS kernel on PowerNV systems, secure boot requires X.509 certificates trusted by the platform. These are stored in secure variables controlled by OPAL, called OPAL secure variables. In order to enable users to manage the keys, the secure variables need to be exposed to userspace. OPAL provides the runtime services for the kernel to be able to access the secure variables. This patchset defines the kernel interface for the OPAL APIs. These APIs are used by the hooks, which load these variables to the keyring and expose them to the userspace for reading/writing. Overall, this patchset adds the following support: * expose secure variables to the kernel via OPAL Runtime API interface * expose secure variables to the userspace via kernel sysfs interface * load kernel verification and revocation keys to .platform and .blacklist keyring respectively. The secure variables can be read/written using simple linux utilities cat/hexdump. For example: Path to the secure variables is: /sys/firmware/secvar/vars Each secure variable is listed as directory. $ ls -l total 0 drwxr-xr-x. 2 root root 0 Aug 20 21:20 db drwxr-xr-x. 2 root root 0 Aug 20 21:20 KEK drwxr-xr-x. 2 root root 0 Aug 20 21:20 PK The attributes of each of the secure variables are (for example: PK): $ ls -l total 0 -r--r--r--. 1 root root 4096 Oct 1 15:10 data -r--r--r--. 1 root root 65536 Oct 1 15:10 size --w-------. 1 root root 4096 Oct 1 15:12 update The "data" is used to read the existing variable value using hexdump. The data is stored in ESL format. The "update" is used to write a new value using cat. The update is to be submitted as AUTH file.
2019-11-13powerpc: Load firmware trusted keys/hashes into kernel keyringNayna Jain
The keys used to verify the Host OS kernel are managed by firmware as secure variables. This patch loads the verification keys into the .platform keyring and revocation hashes into .blacklist keyring. This enables verification and loading of the kernels signed by the boot time keys which are trusted by firmware. Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Eric Richter <erichte@linux.ibm.com> [mpe: Search by compatible in load_powerpc_certs(), not using format] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573441836-3632-5-git-send-email-nayna@linux.ibm.com