summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2012-07-10powerpc: Modify macro ready for %r0 register changeMichael Neuling
The assembler doesn't take %r0 register arguments in braces, so remove them. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc: Add defines for R0-R31Michael Neuling
We are going to use these later and convert r0 to %r0 etc. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10Merge branch 'merge' into nextBenjamin Herrenschmidt
We want to bring in the latest IRQ fixes
2012-07-10powerpc/numa: Avoid stupid uninitialized warning from gccBenjamin Herrenschmidt
Newer gcc are being a bit blind here (it's pretty obvious we don't reach the code path using the array if we haven't initialized the pointer) but none of that is performance critical so let's just silence it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc: Fix build of some debug irq codeBenjamin Herrenschmidt
There was a typo, checking for CONFIG_TRACE_IRQFLAG instead of CONFIG_TRACE_IRQFLAGS causing some useful debug code to not be built This in turns causes a build error on BookE 64-bit due to incorrect semicolons at the end of a couple of macros, so let's fix that too Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: stable@vger.kernel.org [v3.4]
2012-07-10powerpc: More fixes for lazy IRQ vs. idleBenjamin Herrenschmidt
Looks like we still have issues with pSeries and Cell idle code vs. the lazy irq state. In fact, the reset fixes that went upstream are exposing the problem more by causing BUG_ON() to trigger (which this patch turns into a WARN_ON instead). We need to be careful when using a variant of low power state that has the side effect of turning interrupts back on, to properly set all the SW & lazy state to look as if everything is enabled before we enter the low power state with MSR:EE off as we will return with MSR:EE on. If not, we have a discrepancy of state which can cause things to go very wrong later on. This patch moves the logic into a helper and uses it from the pseries and cell idle code. The power4/970 idle code already got things right (in assembly even !) so I'm not touching it. The power7 "bare metal" idle code is subtly different and correct. Remains PA6T and some hypervisor based Cell platforms which have questionable code in there, but they are mostly dead platforms so I'll fix them when I manage to get final answers from the respective maintainers about how the low power state actually works on them. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: stable@vger.kernel.org [v3.4]
2012-07-07Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "Last merge window, we had some updates from Al cleaning up the signal restart handling. These have caused some problems on ARM, and while Al has some fixes, we have some concerns with Al's patches but we've been unsuccesful with discussing this. We have got to the point where we need to do something, and we've decided that the best solution is to revert the appropriate commits until Al is able to reply to us. Also included here are four patches to fix warnings that I've noticed in my build system, and one fix for kprobes test code." * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: fix warning caused by wrongly typed arm_dma_limit ARM: fix warnings about atomic64_read ARM: 7440/1: kprobes: only test 'sub pc, pc, #1b-2b+8-2' on ARMv6 ARM: 7441/1: perf: return -EOPNOTSUPP if requested mode exclusion is unavailable ARM: 7443/1: Revert "new way of handling ERESTART_RESTARTBLOCK" ARM: 7442/1: Revert "remove unused restart trampoline" ARM: fix set_domain() macro ARM: fix mach-versatile/pci.c warning
2012-07-05Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "Small fixes on multiple ARM platforms - A build regression from a previous fix on dove and mv78xx0 - Two fixes for recently (3.5-rc1) changed mmp/pxa code - multiple omap2+ bug fixes - two trivial fixes for i.MX - one v3.5 regression for mxs" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: apx4devkit: fix FEC enabling PHY clock ARM: OMAP2+: hwmod data: Fix wrong McBSP clock alias on OMAP4 ARM: OMAP4: hwmod data: temporarily comment out data for the usb_host_fs and aess IP blocks ARM: Orion: Fix WDT compile for Dove and MV78xx0 ARM: mmp: remove mach/gpio-pxa.h ARM: imx: assert SCC gate stays enabled ARM: OMAP4: TWL6030: ensure sys_nirq1 is mux'd and wakeup enabled ARM: OMAP2: Overo: init I2C before MMC to fix MMC suspend/resume failure ARM: imx27_visstrim_m10: Do not include <asm/system.h> ARM: pxa: hx4700: Fix basic suspend/resume
2012-07-05Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fix from Marcelo Tosatti: "Memory leak and oops on the x86 mmu code, and sanitization of the KVM_IRQFD ioctl." * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: MMU: fix shrinking page from the empty mmu KVM: fix fault page leak KVM: Sanitize KVM_IRQFD flags KVM: Add missing KVM_IRQFD API documentation KVM: Pass kvm_irqfd to functions
2012-07-05ARM: fix warning caused by wrongly typed arm_dma_limitRussell King
arch/arm/mm/init.c: In function 'arm_memblock_init': arch/arm/mm/init.c:380: warning: comparison of distinct pointer types lacks a cast by fixing the typecast in its definition when DMA_ZONE is disabled. This was missed in 4986e5c7c (ARM: mm: fix type of the arm_dma_limit global variable). Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-07-05ARM: fix warnings about atomic64_readRussell King
Fix: net/netfilter/xt_connbytes.c: In function 'connbytes_mt': net/netfilter/xt_connbytes.c:43: warning: passing argument 1 of 'atomic64_read' discards qualifiers from pointer target type ... by adding the missing const. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-07-05ARM: 7440/1: kprobes: only test 'sub pc, pc, #1b-2b+8-2' on ARMv6Rabin Vincent
'sub pc, pc, #1b-2b+8-2' results in address<1:0> == '10'. sub pc, pc, #const (== ADR pc, #const) performs an interworking branch (BXWritePC()) on ARMv7+ and a simple branch (BranchWritePC()) on earlier versions. In ARM state, BXWritePC() is UNPREDICTABLE when address<1:0> == '10'. In ARM state on ARMv6+, BranchWritePC() ignores address<1:0>. Before ARMv6, BranchWritePC() is UNPREDICTABLE if address<1:0> != '00' So the instruction is UNPREDICTABLE both before and after v6. Acked-by: Jon Medhurst <tixy@yxit.co.uk> Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-07-05Merge tag 'omap-fixes-for-v3.5-rc5' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes PM related fixes for omaps mostly to get suspend/resume working again. * tag 'omap-fixes-for-v3.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2+: hwmod data: Fix wrong McBSP clock alias on OMAP4 ARM: OMAP4: hwmod data: temporarily comment out data for the usb_host_fs and aess IP blocks ARM: OMAP4: TWL6030: ensure sys_nirq1 is mux'd and wakeup enabled ARM: OMAP2: Overo: init I2C before MMC to fix MMC suspend/resume failure Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-07-05Merge branch 'mxs/fixes-for-3.5' of ↵Arnd Bergmann
git://git.linaro.org/people/shawnguo/linux-2.6 into fixes * 'mxs/fixes-for-3.5' of git://git.linaro.org/people/shawnguo/linux-2.6: ARM: apx4devkit: fix FEC enabling PHY clock Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-07-05ARM: 7441/1: perf: return -EOPNOTSUPP if requested mode exclusion is unavailableWill Deacon
We currently return -EPERM if the user requests mode exclusion that is not supported by the CPU. This looks pretty confusing from userspace and is inconsistent with other architectures (ppc, x86). This patch returns -EOPNOTSUPP instead. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-07-05ARM: 7443/1: Revert "new way of handling ERESTART_RESTARTBLOCK"Will Deacon
This reverts commit 6b5c8045ecc7e726cdaa2a9d9c8e5008050e1252. Conflicts: arch/arm/kernel/ptrace.c The new syscall restarting code can lead to problems if we take an interrupt in userspace just before restarting the svc instruction. If a signal is delivered when returning from the interrupt, the TIF_SYSCALL_RESTARTSYS will remain set and cause any syscalls executed from the signal handler to be treated as a restart of the previously interrupted system call. This includes the final sigreturn call, meaning that we may fail to exit from the signal context. Furthermore, if a system call made from the signal handler requires a restart via the restart_block, it is possible to clear the thread flag and fail to restart the originally interrupted system call. The right solution to this problem is to perform the restarting in the kernel, avoiding the possibility of handling a further signal before the restart is complete. Since we're almost at -rc6, let's revert the new method for now and aim for in-kernel restarting at a later date. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-07-05ARM: 7442/1: Revert "remove unused restart trampoline"Will Deacon
This reverts commit fa18484d0947b976a769d15c83c50617493c81c1. We need the restart trampoline back so that we can revert a related problematic patch 6b5c8045ecc7e726cdaa2a9d9c8e5008050e1252 ("arm: new way of handling ERESTART_RESTARTBLOCK"). Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-07-05ARM: fix set_domain() macroRussell King
Avoid polluting drivers with a set_domain() macro, which interferes with structure member names: drivers/net/wireless/ath/ath9k/dfs_pattern_detector.c:294:33: error: macro "set_domain" passed 2 arguments, but takes just 1 Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-07-05Merge tag 'omap-fixes-b-for-3.5rc' of ↵Tony Lindgren
git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into fixes A few more OMAP fixes for 3.5-rc. These fix some bugs with power management and McBSP.
2012-07-05ARM: apx4devkit: fix FEC enabling PHY clockLauri Hintsala
Ethernet stopped to work after mxs clk framework change. Signed-off-by: Lauri Hintsala <lauri.hintsala@bluegiga.com> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
2012-07-04ARM: fix mach-versatile/pci.c warningRussell King
arch/arm/mach-versatile/pci.c: In function 'versatile_map_irq': arch/arm/mach-versatile/pci.c:342: warning: unused variable 'devslot' Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-07-04ARM: OMAP2+: hwmod data: Fix wrong McBSP clock alias on OMAP4Benoit Cousson
The commit 503d0ea24d1d3dd3db95e5e0edd693da7a2a23eb ARM: OMAP4: hwmod data: Add aliases for McBSP fclk clocks added a wrong "prcm_clk" alias for PRCM clock whereas the McBSP driver and previous OMAPs are using "prcm_fck". It thus lead to the following warning. [ 47.409729] omap-mcbsp: clks: could not clk_get() prcm_fck Fix that by changing the opt_clk role to prcm_fck. Reported-by: Misael Lopez Cruz <misael.lopez@ti.com> Signed-off-by: Benoit Cousson <b-cousson@ti.com> Cc: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: Sebastien Guiriec <s-guiriec@ti.com> Signed-off-by: Paul Walmsley <paul@pwsan.com>
2012-07-04ARM: OMAP4: hwmod data: temporarily comment out data for the usb_host_fs and ↵Paul Walmsley
aess IP blocks The OMAP4 usb_host_fs (OHCI) and AESS IP blocks require some special programming for them to enter idle. Without this programming, they will prevent the rest of the chip from entering full chip idle. To implement the idle programming cleanly, this will take some coordination between maintainers. This is likely to take some time, so it is probably best to leave this for 3.6 or 3.7. So, in the meantime, prevent these IP blocks from being registered. Later, once the appropriate support is available, this patch can be reverted. This second version comments out the IP block data since Benoît didn't like removing it. Signed-off-by: Paul Walmsley <paul@pwsan.com> Cc: Benoît Cousson <b-cousson@ti.com>
2012-07-04Merge branch 'fixes' of git://github.com/hzhuang1/linux into fixesArnd Bergmann
From Haojian Zhuang <haojian.zhuang@gmail.com>: * 'fixes' of git://github.com/hzhuang1/linux: ARM: mmp: remove mach/gpio-pxa.h Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-07-04Merge tag 'v3.5-imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6 ↵Arnd Bergmann
into fixes From Sascha Hauer <s.hauer@pengutronix.de>: ARM i.MX fixes for v3.5-rc5 * tag 'v3.5-imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6: ARM: imx: assert SCC gate stays enabled ARM: imx27_visstrim_m10: Do not include <asm/system.h> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-07-04ARM: Orion: Fix WDT compile for Dove and MV78xx0Andrew Lunn
Commit 0fa1f0609a0c1fe8b2be3c0089a2cb48f7fda521 (ARM: Orion: Fix Virtual/Physical mixup with watchdog) broke the Dove & MV78xx0 build. Although these two SoC don't use the watchdog, the shared platform code still needs to build. Add the necessary defines. Cc: stable@vger.kernel.org Reported-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-07-04ARM: mmp: remove mach/gpio-pxa.hPaul Bolle
Commit 157d2644cb0c1e71a18baaffca56d2b1d0ebf10f ("ARM: pxa: change gpio to platform device") removed all includes of mach/gpio-pxa.h. It kept this unused header in the tree. Using it can't work, as it itself includes the non-existent header plat/gpio-pxa.h. This header can safely be removed. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Acked-by: Eric Miao <eric.y.miao@gmail.com> Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com>
2012-07-04ARM: imx: assert SCC gate stays enabledUwe Kleine-König
The SCC clock is needed in internal boot mode and so must keep enabled. This same issue was fixed for the pre-common-clk code in commit 3d6e614 (mx35: Fix boot ROM hang in internal boot mode) Cc: John Ogness <jogness@linutronix.de> Cc: Hans J. Koch <hjk@hansjkoch.de> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2012-07-03KVM: MMU: fix shrinking page from the empty mmuXiao Guangrong
Fix: [ 3190.059226] BUG: unable to handle kernel NULL pointer dereference at (null) [ 3190.062224] IP: [<ffffffffa02aac66>] mmu_page_zap_pte+0x10/0xa7 [kvm] [ 3190.063760] PGD 104f50067 PUD 112bea067 PMD 0 [ 3190.065309] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 3190.066860] CPU 1 [ ...... ] [ 3190.109629] Call Trace: [ 3190.111342] [<ffffffffa02aada6>] kvm_mmu_prepare_zap_page+0xa9/0x1fc [kvm] [ 3190.113091] [<ffffffffa02ab2f5>] mmu_shrink+0x11f/0x1f3 [kvm] [ 3190.114844] [<ffffffffa02ab25d>] ? mmu_shrink+0x87/0x1f3 [kvm] [ 3190.116598] [<ffffffff81150c9d>] ? prune_super+0x142/0x154 [ 3190.118333] [<ffffffff8110a4f4>] ? shrink_slab+0x39/0x31e [ 3190.120043] [<ffffffff8110a687>] shrink_slab+0x1cc/0x31e [ 3190.121718] [<ffffffff8110ca1d>] do_try_to_free_pages This is caused by shrinking page from the empty mmu, although we have checked n_used_mmu_pages, it is useless since the check is out of mmu-lock Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-03powerpc/mm: remove obsolete comment about page size name arrayScott Wood
The array of names in hugetlbpage.c no longer exists. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Kill flatdevtree_env.h tooPaul Bolle
Commit 430b01e8f5e524a2bfa50074d97d0bdc2505807b ("[POWERPC] Kill flatdevtree.c") killed the two files including flatdevtree_env.h. It was apparently just an oversight to not kill that header too. Kill it now. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Fix kernel-doc warningWanpeng Li
Warning(arch/powerpc/kernel/pci_of_scan.c:210): Excess function parameter 'node' description in 'of_scan_pci_bridge' Warning(arch/powerpc/kernel/vio.c:636): No description found for parameter 'desired' Warning(arch/powerpc/kernel/vio.c:636): Excess function parameter 'new_desired' description in 'vio_cmo_set_dev_desired' Warning(arch/powerpc/kernel/vio.c:1270): No description found for parameter 'viodrv' Warning(arch/powerpc/kernel/vio.c:1270): Excess function parameter 'drv' description in '__vio_register_driver' Warning(arch/powerpc/kernel/vio.c:1289): No description found for parameter 'viodrv' Warning(arch/powerpc/kernel/vio.c:1289): Excess function parameter 'driver' description in 'vio_unregister_driver' Signed-off-by: Wanpeng Li <liwp@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Fix assmption of end_of_DRAM() returns end addressBharat Bhushan
memblock_end_of_DRAM() returns end_address + 1, not end address. While some code assumes that it returns end address. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Optimise the 64bit optimised __clear_userAnton Blanchard
I blame Mikey for this. He elevated my slightly dubious testcase: to benchmark status. And naturally we need to be number 1 at creating zeros. So lets improve __clear_user some more. As Paul suggests we can use dcbz for large lengths. This patch gets the destination cacheline aligned then uses dcbz on whole cachelines. Before: 10485760000 bytes (10 GB) copied, 0.414744 s, 25.3 GB/s After: 10485760000 bytes (10 GB) copied, 0.268597 s, 39.0 GB/s 39 GB/s, a new record. Signed-off-by: Anton Blanchard <anton@samba.org> Tested-by: Olof Johansson <olof@lixom.net> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/iommu: Implement IOMMU pools to improve multiqueue adapter performanceAnton Blanchard
At the moment all queues in a multiqueue adapter will serialise against the IOMMU table lock. This is proving to be a big issue, especially with 10Gbit ethernet. This patch creates 4 pools and tries to spread the load across them. If the table is under 1GB in size we revert back to the original behaviour of 1 pool and 1 largealloc pool. We create a hash to map CPUs to pools. Since we prefer interrupts to be affinitised to primary CPUs, without some form of hashing we are very likely to end up using the same pool. As an example, POWER7 has 4 way SMT and with 4 pools all primary threads will map to the same pool. The largealloc pool is reduced from 1/2 to 1/4 of the space to partially offset the overhead of breaking the table up into pools. Some performance numbers were obtained with a Chelsio T3 adapter on two POWER7 boxes, running a 100 session TCP round robin test. Performance improved 69% with this patch applied. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/iommu: Push spinlock into iommu_range_alloc and __iommu_freeAnton Blanchard
In preparation for IOMMU pools, push the spinlock into iommu_range_alloc and __iommu_free. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/iommu: Reduce spinlock coverage in iommu_freeAnton Blanchard
This patch moves tce_free outside of the lock in iommu_free. Some performance numbers were obtained with a Chelsio T3 adapter on two POWER7 boxes, running a 100 session TCP round robin test. Performance improved 25% with this patch applied. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/iommu: Reduce spinlock coverage in iommu_alloc and iommu_freeAnton Blanchard
We currently hold the IOMMU spinlock around tce_build and tce_flush. This causes our spinlock hold times to be much higher than required and can impact multiqueue adapters. This patch moves tce_build and tce_flush outside of the lock in iommu_alloc, and tce_flush outside of the lock in iommu_free. Some performance numbers were obtained with a Chelsio T3 adapter on two POWER7 boxes, running a 100 session TCP round robin test. Performance improved 32% with this patch applied. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/pseries: Disable interrupts around IOMMU percpu data accessesAnton Blanchard
tce_buildmulti_pSeriesLP uses a per cpu page to communicate with the hypervisor. We currently rely on the IOMMU table spinlock but subsequent patches will be removing that so disable interrupts around all accesses of tce_page. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: POWER7 optimised memcpy using VMX and enhanced prefetchAnton Blanchard
Implement a POWER7 optimised memcpy using VMX and enhanced prefetch instructions. This is a copy of the POWER7 optimised copy_to_user/copy_from_user loop. Detailed implementation and performance details can be found in commit a66086b8197d (powerpc: POWER7 optimised copy_to_user/copy_from_user using VMX). I noticed memcpy issues when profiling a RAID6 workload: .memcpy .async_memcpy .async_copy_data .__raid_run_ops .handle_stripe .raid5d .md_thread I created a simplified testcase by building a RAID6 array with 4 1GB ramdisks (booting with brd.rd_size=1048576): # mdadm -CR -e 1.2 /dev/md0 --level=6 -n4 /dev/ram[0-3] I then timed how long it took to write to the entire array: # dd if=/dev/zero of=/dev/md0 bs=1M Before: 892 MB/s After: 999 MB/s A 12% improvement. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Use enhanced touch instructions in POWER7 copy_to_user/copy_from_userAnton Blanchard
Version 2.06 of the POWER ISA introduced enhanced touch instructions, allowing us to specify a number of attributes including the length of a stream. This patch adds a software stream for both loads and stores in the POWER7 copy_tofrom_user loop. Since the setup is quite complicated and we have to use an eieio to ensure correct ordering of the "GO" command we only do this for copies above 4kB. To quantify any performance improvements we need a working set bigger than the caches so we operate on a 1GB file: # dd if=/dev/zero of=/tmp/foo bs=1M count=1024 And we compare how fast we can read the file: # dd if=/tmp/foo of=/dev/null bs=1M before: 7.7 GB/s after: 9.6 GB/s A 25% improvement. The worst case for this patch will be a completely L1 cache contained copy of just over 4kB. We can test this with the copy_to_user testcase we used to tune copy_tofrom_user originally: http://ozlabs.org/~anton/junkcode/copy_to_user.c # time ./copy_to_user2 -l 4224 -i 10000000 before: 6.807 s after: 6.946 s A 2% slowdown, which seems reasonable considering our data is unlikely to be completely L1 contained. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/pci: cleanup on duplicate assignmentGavin Shan
While creating the PCI root bus through function pci_create_root_bus() of PCI core, it should have assigned the secondary bus number for the newly created PCI root bus. Thus we needn't do the explicit assignment for the secondary bus number again in pcibios_scan_phb(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/numa: Fix OF node refcounting bugGavin Shan
The form affinity for NUMA is set to 1 if the firmware supports OPAL. Otherwise, we have to retrieve that from OF node "/chosen". For the latter case, OF node "/chosen" reference count was never decreased. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: POWER7 optimised copy_page using VMX and enhanced prefetchAnton Blanchard
Implement a POWER7 optimised copy_page using VMX and enhanced prefetch instructions. We use enhanced prefetch hints to prefetch both the load and store side. We copy a cacheline at a time and fall back to regular loads and stores if we are unable to use VMX (eg we are in an interrupt). The following microbenchmark was used to assess the impact of the patch: http://ozlabs.org/~anton/junkcode/page_fault_file.c We test MAP_PRIVATE page faults across a 1GB file, 100 times: # time ./page_fault_file -p -l 1G -i 100 Before: 22.25s After: 18.89s 17% faster Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Rename copyuser_power7_vmx.c to vmx-helper.cAnton Blanchard
Subsequent patches will add more VMX library functions and it makes sense to keep all the c-code helper functions in the one file. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Clear RI and EE at the same time in system call exitAnton Blanchard
mtmsrd is an expensive instruction, we save a few cycles by doing it once instead of twice. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Use enhanced touch instructions in POWER7 copy_to_user/copy_from_userAnton Blanchard
Version 2.06 of the POWER ISA introduced enhanced touch instructions, allowing us to specify a number of attributes including the length of a stream. This patch adds a software stream for both loads and stores in the POWER7 copy_tofrom_user loop. Since the setup is quite complicated and we have to use an eieio to ensure correct ordering of the "GO" command we only do this for copies above 4kB. To quantify any performance improvements we need a working set bigger than the caches so we operate on a 1GB file: # dd if=/dev/zero of=/tmp/foo bs=1M count=1024 And we compare how fast we can read the file: # dd if=/tmp/foo of=/dev/null bs=1M before: 7.7 GB/s after: 9.6 GB/s A 25% improvement. The worst case for this patch will be a completely L1 cache contained copy of just over 4kB. We can test this with the copy_to_user testcase we used to tune copy_tofrom_user originally: http://ozlabs.org/~anton/junkcode/copy_to_user.c # time ./copy_to_user2 -l 4224 -i 10000000 before: 6.807 s after: 6.946 s A 2% slowdown, which seems reasonable considering our data is unlikely to be completely L1 contained. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/smp: remove call to ipi_call_lock()/ipi_call_unlock()Yong Zhang
1) call_function.lock used in smp_call_function_many() is just to protect call_function.queue and &data->refs, cpu_online_mask is outside of the lock. And it's not necessary to protect cpu_online_mask, because data->cpumask is pre-calculate and even if a cpu is brougt up when calling arch_send_call_function_ipi_mask(), it's harmless because validation test in generic_smp_call_function_interrupt() will take care of it. 2) For cpu down issue, stop_machine() will guarantee that no concurrent smp_call_fuction() is processing. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: 64bit optimised __clear_userAnton Blanchard
I noticed __clear_user high up in a profile of one of my RAID stress tests. The testcase was doing a dd from /dev/zero which ends up calling __clear_user. __clear_user is basically a loop with a single 4 byte store which is horribly slow. We can do much better by aligning the desination and doing 32 bytes of 8 byte stores in a loop. The following testcase was used to verify the patch: http://ozlabs.org/~anton/junkcode/stress_clear_user.c To show the improvement in performance I ran a dd from /dev/zero to /dev/null on a POWER7 box: Before: # dd if=/dev/zero of=/dev/null bs=1M count=10000 10485760000 bytes (10 GB) copied, 3.72379 s, 2.8 GB/s After: # time dd if=/dev/zero of=/dev/null bs=1M count=10000 10485760000 bytes (10 GB) copied, 0.728318 s, 14.4 GB/s Over 5x faster. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: tracing: Avoid tracepoint duplication with DECLARE_EVENT_CLASSAnton Blanchard
irq_entry, irq_exit, timer_interrupt_entry and timer_interrupt_exit all do the same thing so use DECLARE_EVENT_CLASS to avoid duplicating everything 4 times. This saves quite a lot of space in both instruction text and data: text data bss dec hex filename 9265 19622 16 28903 70e7 arch/powerpc/kernel/irq.o 6817 19019 16 25852 64fc arch/powerpc/kernel/irq.o Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>