Age | Commit message (Collapse) | Author |
|
Convert KVM to use the MIPS_ENTRYLO_* definitions from <asm/mipsregs.h>
rather than custom definitions in kvm_host.h
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Simplify some of the TLB_ macros making use of the arrayification of
tlb_lo. Basically we index the array by the bit of the virtual address
which determines whether the even or odd entry is used, instead of
having a conditional.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The values of the EntryLo0 and EntryLo1 registers for a TLB entry are
stored in separate members of struct kvm_mips_tlb called tlb_lo0 and
tlb_lo1 respectively. To allow future code which needs to manipulate
arbitrary EntryLo data in the TLB entry to be simpler and less
conditional, replace these members with an array of two elements.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The host kernel's exception vector base address is currently saved in
the VCPU structure at creation time, and restored on a guest exit.
However it doesn't change and can already be easily accessed from the
'ebase' variable (arch/mips/kernel/traps.c), so drop the host_ebase
member of kvm_vcpu_arch, export the 'ebase' variable to modules and load
from there instead.
This does result in a single extra instruction (lui) on the guest exit
path, but simplifies the code a bit and removes the redundant storage of
the host exception base address.
Credit for the idea goes to Cavium's VZ KVM implementation.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The function kvm_mips_handle_mapped_seg_tlb_fault() has two completely
unused pointer arguments, hpa0 and hpa1, for which all users always pass
NULL.
Drop these two arguments and update the callers.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Several KVM module functions are indirected so that they can be accessed
from tlb.c which is statically built into the kernel. This is no longer
necessary as the relevant bits of code have moved into mmu.c which is
part of the KVM module, so drop the indirections.
Note: is_error_pfn() is defined inline in kvm_host.h, so didn't actually
require the KVM module to be loaded for it to work anyway.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Various functions in tlb.c perform higher level MMU handling, but don't
strictly need to be statically built into the kernel as they don't
directly manipulate TLB entries. Move these functions out into a
separate mmu.c which will be built into the KVM kernel module. This
allows them to directly reference KVM functions in the KVM kernel module
in future.
Module exports of these functions have been removed, since they aren't
needed outside of KVM.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The CP0 Cause register is passed around in KVM quite a bit, often as an
unsigned long, even though it is always 32-bits long.
Resize it to u32 throughout MIPS KVM.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Convert the MIPS kvm_host.h structs, function declaration prototypes and
associated definition prototypes to use standard kernel sized types
(e.g. u32) instead of inttypes.h style ones (e.g. uint32_t).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The host EntryHi in the KVM VCPU context is virtually unused. It gets
stored on exceptions, but only ever used in a kvm_debug() when a TLB
miss occurs.
Drop it entirely, removing that information from the kvm_debug output.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The MIPS kvm_vcpu_arch::guest_inst isn't used, so drop it from the
struct and drop its asm-offsets definition.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When faulting guest addresses are matched against guest segments with
the KVM_GUEST_KSEGX() macro, change the mask to 0xe0000000 so as to
include bit 31.
This is mainly for safety's sake, as it prevents a rogue BadVAddr in the
host kseg2/kseg3 segments (e.g. 0xC*******) after a TLB exception from
matching the guest kseg0 segment (e.g. 0x4*******), triggering an
internal KVM error instead of allowing the corresponding guest kseg0
page to be mapped into the host vmalloc space.
Such a rogue BadVAddr was observed to happen with the host MIPS kernel
running under QEMU with KVM built as a module, due to a not entirely
transparent optimisation in the QEMU TLB handling. This has already been
worked around properly in a previous commit.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never
get a TLB refill exception in it when KVM is built as a module.
This was observed to happen with the host MIPS kernel running under
QEMU, due to a not entirely transparent optimisation in the QEMU TLB
handling where TLB entries replaced with TLBWR are copied to a separate
part of the TLB array. Code in those pages continue to be executable,
but those mappings persist only until the next ASID switch, even if they
are marked global.
An ASID switch happens in __kvm_mips_vcpu_run() at exception level after
switching to the guest exception base. Subsequent TLB mapped kernel
instructions just prior to switching to the guest trigger a TLB refill
exception, which enters the guest exception handlers without updating
EPC. This appears as a guest triggered TLB refill on a host kernel
mapped (host KSeg2) address, which is not handled correctly as user
(guest) mode accesses to kernel (host) segments always generate address
error exceptions.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.10.x-
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Pull more MIPS updates from Ralf Baechle:
"This is the secondnd batch of MIPS patches for 4.7. Summary:
CPS:
- Copy EVA configuration when starting secondary VPs.
EIC:
- Clear Status IPL.
Lasat:
- Fix a few off by one bugs.
lib:
- Mark intrinsics notrace. Not only are the intrinsics
uninteresting, it would cause infinite recursion.
MAINTAINERS:
- Add file patterns for MIPS BRCM device tree bindings.
- Add file patterns for mips device tree bindings.
MT7628:
- Fix MT7628 pinmux typos.
- wled_an pinmux gpio.
- EPHY LEDs pinmux support.
Pistachio:
- Enable KASLR
VDSO:
- Build microMIPS VDSO for microMIPS kernels.
- Fix aliasing warning by building with `-fno-strict-aliasing' for
debugging but also tracing them might result in recursion.
Misc:
- Add missing FROZEN hotplug notifier transitions.
- Fix clk binding example for varioius PIC32 devices.
- Fix cpu interrupt controller node-names in the DT files.
- Fix XPA CPU feature separation.
- Fix write_gc0_* macros when writing zero.
- Add inline asm encoding helpers.
- Add missing VZ accessor microMIPS encodings.
- Fix little endian microMIPS MSA encodings.
- Add 64-bit HTW fields and fix its configuration.
- Fix sigreturn via VDSO on microMIPS kernel.
- Lots of typo fixes.
- Add definitions of SegCtl registers and use them"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (49 commits)
MIPS: Add missing FROZEN hotplug notifier transitions
MIPS: Build microMIPS VDSO for microMIPS kernels
MIPS: Fix sigreturn via VDSO on microMIPS kernel
MIPS: devicetree: fix cpu interrupt controller node-names
MIPS: VDSO: Build with `-fno-strict-aliasing'
MIPS: Pistachio: Enable KASLR
MIPS: lib: Mark intrinsics notrace
MIPS: Fix 64-bit HTW configuration
MIPS: Add 64-bit HTW fields
MAINTAINERS: Add file patterns for mips device tree bindings
MAINTAINERS: Add file patterns for mips brcm device tree bindings
MIPS: Simplify DSP instruction encoding macros
MIPS: Add missing tlbinvf/XPA microMIPS encodings
MIPS: Fix little endian microMIPS MSA encodings
MIPS: Add missing VZ accessor microMIPS encodings
MIPS: Add inline asm encoding helpers
MIPS: Spelling fix lets -> let's
MIPS: VR41xx: Fix typo
MIPS: oprofile: Fix typo
MIPS: math-emu: Fix typo
...
|
|
Add field definitions for some of the 64-bit specific Hardware page
Table Walker (HTW) register fields in PWSize and PWCtl, in preparation
for fixing the 64-bit HTW configuration.
Also print these fields out along with the others in print_htw_config().
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13363/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Simplify the DSP instruction wrapper macros which use explicit encodings
for microMIPS and normal MIPS by using the new encoding macros and
removing duplication.
To me this makes it easier to read since it is much shorter, but it also
ensures .insn is used, preventing objdump disassembling the microMIPS
code as normal MIPS.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13314/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Hardcoded MIPS instruction encodings are provided for tlbinvf, mfhc0 &
mthc0 instructions, but microMIPS encodings are missing. I doubt any
microMIPS cores exist at present which support these instructions, but
the microMIPS encodings exist, and microMIPS cores may support them in
the future. Add the missing microMIPS encodings using the new macros.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13313/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
When the toolchain doesn't support MSA we encode MSA instructions
explicitly in assembly. Unfortunately we use .word for both MIPS and
microMIPS encodings which is wrong, since 32-bit microMIPS instructions
are made up from a pair of halfwords.
- The most significant halfword always comes first, so for little endian
builds the halves will be emitted in the wrong order.
- 32-bit alignment isn't guaranteed, so the assembler may insert a
16-bit nop instruction to pad the instruction stream to a 32-bit
boundary.
Use the new instruction encoding macros to encode microMIPS MSA
instructions correctly.
Fixes: d96cc3d1ec5d ("MIPS: Add microMIPS MSA support.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <Paul.Burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13312/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Toolchains may be used which support microMIPS but not VZ instructions
(i.e. binutis 2.22 & 2.23), so extend the explicitly encoded versions of
the guest COP0 register & guest TLB access macros to support microMIPS
encodings too, using the new macros.
This prevents non-microMIPS instructions being executed in microMIPS
mode during CPU probe on cores supporting VZ (e.g. M5150), which cause
reserved instruction exceptions early during boot.
Fixes: bad50d79255a ("MIPS: Fix VZ probe gas errors with binutils <2.24")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13311/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
To allow simplification of macros which use inline assembly to
explicitly encode instructions, add a few simple abstractions to
mipsregs.h which expand to specific microMIPS or normal MIPS encodings
depending on what type of kernel is being built:
_ASM_INSN_IF_MIPS(_enc) : Emit a 32bit MIPS instruction if microMIPS is
not enabled.
_ASM_INSN32_IF_MM(_enc) : Emit a 32bit microMIPS instruction if enabled.
_ASM_INSN16_IF_MM(_enc) : Emit a 16bit microMIPS instruction if enabled.
The macros can be used one after another since the MIPS / microMIPS
macros are mutually exclusive, for example:
__asm__ __volatile__(
".set push\n\t"
".set noat\n\t"
"# mfgc0 $1, $%1, %2\n\t"
_ASM_INSN_IF_MIPS(0x40610000 | %1 << 11 | %2)
_ASM_INSN32_IF_MM(0x002004fc | %1 << 16 | %2 << 11)
"move %0, $1\n\t"
".set pop"
: "=r" (__res)
: "i" (source), "i" (sel));
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13310/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
As noticed by Sergei in the discussion of Andrea Gelmini's patch series.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
|
|
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: adam.buchbinder@gmail.com
Cc: trivial@kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13328/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: david.daney@cavium.com
Cc: janne.huttunen@nokia.com
Cc: aaro.koskinen@nokia.com
Cc: trivial@kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13324/
Patchwork: https://patchwork.linux-mips.org/patch/13325/
Patchwork: https://patchwork.linux-mips.org/patch/13326/
Patchwork: https://patchwork.linux-mips.org/patch/13327/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: linux-mips@linux-mips.org
Cc: trivial@kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13323/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: chenhc@lemote.com
Cc: viresh.kumar@linaro.org
Cc: linux-mips@linux-mips.org
Cc: trivial@kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13322/
Patchwork: https://patchwork.linux-mips.org/patch/13332/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: trivial@kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13321/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: trivial@kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13320/
Patchwork: https://patchwork.linux-mips.org/patch/13335/
Patchwork: https://patchwork.linux-mips.org/patch/13336/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: trivial@kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13319/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: trivial@kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13318/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: chenhc@lemote.com
Cc: james.hogan@imgtec.com
Cc: linux-mips@linux-mips.org
Cc: trivial@kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13317/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
The versions of the __write_{32,64}bit_gc0_register() macros for when
there is no virt support in the assembler use the "J" inline asm
constraint to allow integer zero, but this needs to be accompanied by
the "z" formatting string so that it turns into $0. Fix both macros to
do this.
Fixes: bad50d79255a ("MIPS: Fix VZ probe gas errors with binutils <2.24")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13289/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
The SegCtl registers are standard for MIPSr3..MIPSr5. Add definitions of
these registers and use them rather than constants
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Joshua Kinard <kumba@gentoo.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Chris Packham <judge.packham@gmail.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13290/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Pull MTD updates from Brian Norris:
"First cycle with Boris as NAND maintainer! Many (most) bullets stolen
from him.
Generic:
- Migrated NAND LED trigger to be a generic MTD trigger
NAND:
- Introduction of the "ECC algorithm" concept, to avoid overloading
the ECC mode field too much more
- Replaced the nand_ecclayout infrastructure with something a little
more flexible (finally!) and future proof
- Rework of the OMAP GPMC and NAND drivers; the TI folks pulled some
of this into their own tree as well
- Prepare the sunxi NAND driver to receive DMA support
- Handle bitflips in erased pages on GPMI revisions that do not
support this in hardware.
SPI NOR:
- Start using the spi_flash_read() API for SPI drivers that support
it (i.e., SPI drivers with special memory-mapped flash modes)
And other small scattered improvments"
* tag 'for-linus-20160523' of git://git.infradead.org/linux-mtd: (155 commits)
mtd: spi-nor: support GigaDevice gd25lq64c
mtd: nand_bch: fix spelling of "probably"
mtd: brcmnand: respect ECC algorithm set by NAND subsystem
gpmi-nand: Handle ECC Errors in erased pages
Documentation: devicetree: deprecate "soft_bch" nand-ecc-mode value
mtd: nand: add support for "nand-ecc-algo" DT property
mtd: mtd: drop NAND_ECC_SOFT_BCH enum value
mtd: drop support for NAND_ECC_SOFT_BCH as "soft_bch" mapping
mtd: nand: read ECC algorithm from the new field
mtd: nand: fsmc: validate ECC setup by checking algorithm directly
mtd: nand: set ECC algorithm to Hamming on fallback
staging: mt29f_spinand: set ECC algorithm explicitly
CRIS v32: nand: set ECC algorithm explicitly
mtd: nand: atmel: set ECC algorithm explicitly
mtd: nand: davinci: set ECC algorithm explicitly
mtd: nand: bf5xx: set ECC algorithm explicitly
mtd: nand: omap2: Fix high memory dma prefetch transfer
mtd: nand: omap2: Start dma request before enabling prefetch
mtd: nandsim: add __init attribute
mtd: nand: move of_get_nand_xxx() helpers into nand_base.c
...
|
|
The binary GCD algorithm is based on the following facts:
1. If a and b are all evens, then gcd(a,b) = 2 * gcd(a/2, b/2)
2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
3. If a and b are all odds, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)
Even on x86 machines with reasonable division hardware, the binary
algorithm runs about 25% faster (80% the execution time) than the
division-based Euclidian algorithm.
On platforms like Alpha and ARMv6 where division is a function call to
emulation code, it's even more significant.
There are two variants of the code here, depending on whether a fast
__ffs (find least significant set bit) instruction is available. This
allows the unpredictable branches in the bit-at-a-time shifting loop to
be eliminated.
If fast __ffs is not available, the "even/odd" GCD variant is used.
I use the following code to benchmark:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#define swap(a, b) \
do { \
a ^= b; \
b ^= a; \
a ^= b; \
} while (0)
unsigned long gcd0(unsigned long a, unsigned long b)
{
unsigned long r;
if (a < b) {
swap(a, b);
}
if (b == 0)
return a;
while ((r = a % b) != 0) {
a = b;
b = r;
}
return b;
}
unsigned long gcd1(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
b >>= __builtin_ctzl(b);
for (;;) {
a >>= __builtin_ctzl(a);
if (a == b)
return a << __builtin_ctzl(r);
if (a < b)
swap(a, b);
a -= b;
}
}
unsigned long gcd2(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
r &= -r;
while (!(b & r))
b >>= 1;
for (;;) {
while (!(a & r))
a >>= 1;
if (a == b)
return a;
if (a < b)
swap(a, b);
a -= b;
a >>= 1;
if (a & r)
a += b;
a >>= 1;
}
}
unsigned long gcd3(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
b >>= __builtin_ctzl(b);
if (b == 1)
return r & -r;
for (;;) {
a >>= __builtin_ctzl(a);
if (a == 1)
return r & -r;
if (a == b)
return a << __builtin_ctzl(r);
if (a < b)
swap(a, b);
a -= b;
}
}
unsigned long gcd4(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
r &= -r;
while (!(b & r))
b >>= 1;
if (b == r)
return r;
for (;;) {
while (!(a & r))
a >>= 1;
if (a == r)
return r;
if (a == b)
return a;
if (a < b)
swap(a, b);
a -= b;
a >>= 1;
if (a & r)
a += b;
a >>= 1;
}
}
static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = {
gcd0, gcd1, gcd2, gcd3, gcd4,
};
#define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0]))
#if defined(__x86_64__)
#define rdtscll(val) do { \
unsigned long __a,__d; \
__asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \
(val) = ((unsigned long long)__a) | (((unsigned long long)__d)<<32); \
} while(0)
static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
unsigned long a, unsigned long b, unsigned long *res)
{
unsigned long long start, end;
unsigned long long ret;
unsigned long gcd_res;
rdtscll(start);
gcd_res = gcd(a, b);
rdtscll(end);
if (end >= start)
ret = end - start;
else
ret = ~0ULL - start + 1 + end;
*res = gcd_res;
return ret;
}
#else
static inline struct timespec read_time(void)
{
struct timespec time;
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time);
return time;
}
static inline unsigned long long diff_time(struct timespec start, struct timespec end)
{
struct timespec temp;
if ((end.tv_nsec - start.tv_nsec) < 0) {
temp.tv_sec = end.tv_sec - start.tv_sec - 1;
temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec;
} else {
temp.tv_sec = end.tv_sec - start.tv_sec;
temp.tv_nsec = end.tv_nsec - start.tv_nsec;
}
return temp.tv_sec * 1000000000ULL + temp.tv_nsec;
}
static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
unsigned long a, unsigned long b, unsigned long *res)
{
struct timespec start, end;
unsigned long gcd_res;
start = read_time();
gcd_res = gcd(a, b);
end = read_time();
*res = gcd_res;
return diff_time(start, end);
}
#endif
static inline unsigned long get_rand()
{
if (sizeof(long) == 8)
return (unsigned long)rand() << 32 | rand();
else
return rand();
}
int main(int argc, char **argv)
{
unsigned int seed = time(0);
int loops = 100;
int repeats = 1000;
unsigned long (*res)[TEST_ENTRIES];
unsigned long long elapsed[TEST_ENTRIES];
int i, j, k;
for (;;) {
int opt = getopt(argc, argv, "n:r:s:");
/* End condition always first */
if (opt == -1)
break;
switch (opt) {
case 'n':
loops = atoi(optarg);
break;
case 'r':
repeats = atoi(optarg);
break;
case 's':
seed = strtoul(optarg, NULL, 10);
break;
default:
/* You won't actually get here. */
break;
}
}
res = malloc(sizeof(unsigned long) * TEST_ENTRIES * loops);
memset(elapsed, 0, sizeof(elapsed));
srand(seed);
for (j = 0; j < loops; j++) {
unsigned long a = get_rand();
/* Do we have args? */
unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
unsigned long long min_elapsed[TEST_ENTRIES];
for (k = 0; k < repeats; k++) {
for (i = 0; i < TEST_ENTRIES; i++) {
unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]);
if (k == 0 || min_elapsed[i] > tmp)
min_elapsed[i] = tmp;
}
}
for (i = 0; i < TEST_ENTRIES; i++)
elapsed[i] += min_elapsed[i];
}
for (i = 0; i < TEST_ENTRIES; i++)
printf("gcd%d: elapsed %llu\n", i, elapsed[i]);
k = 0;
srand(seed);
for (j = 0; j < loops; j++) {
unsigned long a = get_rand();
unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
for (i = 1; i < TEST_ENTRIES; i++) {
if (res[j][i] != res[j][0])
break;
}
if (i < TEST_ENTRIES) {
if (k == 0) {
k = 1;
fprintf(stderr, "Error:\n");
}
fprintf(stderr, "gcd(%lu, %lu): ", a, b);
for (i = 0; i < TEST_ENTRIES; i++)
fprintf(stderr, "%ld%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n");
}
}
if (k == 0)
fprintf(stderr, "PASS\n");
free(res);
return 0;
}
Compiled with "-O2", on "VirtualBox 4.4.0-22-generic #38-Ubuntu x86_64" got:
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 10174
gcd1: elapsed 2120
gcd2: elapsed 2902
gcd3: elapsed 2039
gcd4: elapsed 2812
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9309
gcd1: elapsed 2280
gcd2: elapsed 2822
gcd3: elapsed 2217
gcd4: elapsed 2710
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9589
gcd1: elapsed 2098
gcd2: elapsed 2815
gcd3: elapsed 2030
gcd4: elapsed 2718
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9914
gcd1: elapsed 2309
gcd2: elapsed 2779
gcd3: elapsed 2228
gcd4: elapsed 2709
PASS
[akpm@linux-foundation.org: avoid #defining a CONFIG_ variable]
Signed-off-by: Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Merge updates from Andrew Morton:
- fsnotify fix
- poll() timeout fix
- a few scripts/ tweaks
- debugobjects updates
- the (small) ocfs2 queue
- Minor fixes to kernel/padata.c
- Maybe half of the MM queue
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits)
mm, page_alloc: restore the original nodemask if the fast path allocation failed
mm, page_alloc: uninline the bad page part of check_new_page()
mm, page_alloc: don't duplicate code in free_pcp_prepare
mm, page_alloc: defer debugging checks of pages allocated from the PCP
mm, page_alloc: defer debugging checks of freed pages until a PCP drain
cpuset: use static key better and convert to new API
mm, page_alloc: inline pageblock lookup in page free fast paths
mm, page_alloc: remove unnecessary variable from free_pcppages_bulk
mm, page_alloc: pull out side effects from free_pages_check
mm, page_alloc: un-inline the bad part of free_pages_check
mm, page_alloc: check multiple page fields with a single branch
mm, page_alloc: remove field from alloc_context
mm, page_alloc: avoid looking up the first zone in a zonelist twice
mm, page_alloc: shortcut watermark checks for order-0 pages
mm, page_alloc: reduce cost of fair zone allocation policy retry
mm, page_alloc: shorten the page allocator fast path
mm, page_alloc: check once if a zone has isolated pageblocks
mm, page_alloc: move __GFP_HARDWALL modifications out of the fastpath
mm, page_alloc: simplify last cpupid reset
mm, page_alloc: remove unnecessary initialisation from __alloc_pages_nodemask()
...
|
|
I've just discovered that the useful-sounding has_transparent_hugepage()
is actually an architecture-dependent minefield: on some arches it only
builds if CONFIG_TRANSPARENT_HUGEPAGE=y, on others it's also there when
not, but on some of those (arm and arm64) it then gives the wrong
answer; and on mips alone it's marked __init, which would crash if
called later (but so far it has not been called later).
Straighten this out: make it available to all configs, with a sensible
default in asm-generic/pgtable.h, removing its definitions from those
arches (arc, arm, arm64, sparc, tile) which are served by the default,
adding #define has_transparent_hugepage has_transparent_hugepage to
those (mips, powerpc, s390, x86) which need to override the default at
runtime, and removing the __init from mips (but maybe that kind of code
should be avoided after init: set a static variable the first time it's
called).
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Yang Shi <yang.shi@linaro.org>
Cc: Ning Qu <quning@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [arch/s390]
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull KVM updates from Paolo Bonzini:
"Small release overall.
x86:
- miscellaneous fixes
- AVIC support (local APIC virtualization, AMD version)
s390:
- polling for interrupts after a VCPU goes to halted state is now
enabled for s390
- use hardware provided information about facility bits that do not
need any hypervisor activity, and other fixes for cpu models and
facilities
- improve perf output
- floating interrupt controller improvements.
MIPS:
- miscellaneous fixes
PPC:
- bugfixes only
ARM:
- 16K page size support
- generic firmware probing layer for timer and GIC
Christoffer Dall (KVM-ARM maintainer) says:
"There are a few changes in this pull request touching things
outside KVM, but they should all carry the necessary acks and it
made the merge process much easier to do it this way."
though actually the irqchip maintainers' acks didn't make it into the
patches. Marc Zyngier, who is both irqchip and KVM-ARM maintainer,
later acked at http://mid.gmane.org/573351D1.4060303@arm.com ('more
formally and for documentation purposes')"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (82 commits)
KVM: MTRR: remove MSR 0x2f8
KVM: x86: make hwapic_isr_update and hwapic_irr_update look the same
svm: Manage vcpu load/unload when enable AVIC
svm: Do not intercept CR8 when enable AVIC
svm: Do not expose x2APIC when enable AVIC
KVM: x86: Introducing kvm_x86_ops.apicv_post_state_restore
svm: Add VMEXIT handlers for AVIC
svm: Add interrupt injection via AVIC
KVM: x86: Detect and Initialize AVIC support
svm: Introduce new AVIC VMCB registers
KVM: split kvm_vcpu_wake_up from kvm_vcpu_kick
KVM: x86: Introducing kvm_x86_ops VCPU blocking/unblocking hooks
KVM: x86: Introducing kvm_x86_ops VM init/destroy hooks
KVM: x86: Rename kvm_apic_get_reg to kvm_lapic_get_reg
KVM: x86: Misc LAPIC changes to expose helper functions
KVM: shrink halt polling even more for invalid wakeups
KVM: s390: set halt polling to 80 microseconds
KVM: halt_polling: provide a way to qualify wakeups during poll
KVM: PPC: Book3S HV: Re-enable XICS fast path for irqfd-generated interrupts
kvm: Conditionally register IRQ bypass consumer
...
|
|
The VZ guest register & TLB access macros introduced in commit "MIPS:
Add guest CP0 accessors" use VZ ASE specific instructions that aren't
understood by versions of binutils prior to 2.24.
Add a check for whether the toolchain supports the -mvirt option,
similar to the MSA toolchain check, and implement the accessors using
.word if not.
Due to difficulty in converting compiler specified registers (e.g. "$3")
to usable numbers (e.g. "3") in inline asm, we need to copy to/from a
temporary register, namely the assembler temporary (at/$1), and specify
guest CP0 registers numerically in the gc0 macros.
Fixes: 7eb91118227d ("MIPS: Add guest CP0 accessors")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-next@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13255/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Fix a build regression from commit c9017757c532 ("MIPS: init upper 64b
of vector registers when MSA is first used"):
arch/mips/built-in.o: In function `enable_restore_fp_context':
traps.c:(.text+0xbb90): undefined reference to `_init_msa_upper'
traps.c:(.text+0xbb90): relocation truncated to fit: R_MIPS_26 against `_init_msa_upper'
traps.c:(.text+0xbef0): undefined reference to `_init_msa_upper'
traps.c:(.text+0xbef0): relocation truncated to fit: R_MIPS_26 against `_init_msa_upper'
to !CONFIG_CPU_HAS_MSA configurations with older GCC versions, which are
unable to figure out that calls to `_init_msa_upper' are indeed dead.
Of the many ways to tackle this failure choose the approach we have
already taken in `thread_msa_context_live'.
[ralf@linux-mips.org: Drop patch segment to junk file.]
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: stable@vger.kernel.org # v3.16+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13271/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Fix mips_cm_lock_other compilation error when MIPS_CM is not selected.
This was introduced in commit 23d5de8efb9a (MIPS: CM: Introduce core-other
locking functions)
Signed-off-by: Tony Wu <tung7970@gmail.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11698/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
The DT fragment will select the ohci-platform driver, since that can
handle the JZ4740 OHCI just fine. While I don't have a JZ4740-based
board with anything connected to the USB host controller, I did test
the generic OHCI driver successfully on a JZ4770-based board.
The device is disabled by default; boards that want to use it can
override the "status" property. The mass-production Qi LB60 boards
don't use the USB host controller.
Signed-off-by: Maarten ter Huurne <maarten@treewalker.org>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13104/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Some wakeups should not be considered a sucessful poll. For example on
s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
would be considered runnable - letting all vCPUs poll all the time for
transactional like workload, even if one vCPU would be enough.
This can result in huge CPU usage for large guests.
This patch lets architectures provide a way to qualify wakeups if they
should be considered a good/bad wakeups in regard to polls.
For s390 the implementation will fence of halt polling for anything but
known good, single vCPU events. The s390 implementation for floating
interrupts does a wakeup for one vCPU, but the interrupt will be delivered
by whatever CPU checks first for a pending interrupt. We prefer the
woken up CPU by marking the poll of this CPU as "good" poll.
This code will also mark several other wakeup reasons like IPI or
expired timers as "good". This will of course also mark some events as
not sucessful. As KVM on z runs always as a 2nd level hypervisor,
we prefer to not poll, unless we are really sure, though.
This patch successfully limits the CPU usage for cases like uperf 1byte
transactional ping pong workload or wakeup heavy workload like OLTP
while still providing a proper speedup.
This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
wakeups that are considered not good for polling.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
Cc: David Matlack <dmatlack@google.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
[Rename config symbol. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
the memory type
Mediatek MT7620 SoC has syscfg0 bits where it sets the type of memory being used.
However, sometimes those bits are not set properly (reading "11"). In this case, the SoC assumes SDRAM.
The patch below reflects that.
Signed-off-by: Sashka Nochkin <linux-mips@durdom.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13135/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Remove a duplicate o32 `elf_check_arch' implementation, move all macro
variants to <asm/elf.h> and define them unconditionally under indvidual
names, substituting alias `elf_check_arch' definitions in variant code.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13245/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13244/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Move the `mips_elf_abiflags_v0' structure and FP ABI flag macros outside
#ifndef ELF_ARCH. These are public interfaces.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13243/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Add a few new cpu-features.h definitions for VZ sub-features, namely the
existence of the CP0_GuestCtl0Ext, CP0_GuestCtl1, and CP0_GuestCtl2
registers, and support for GuestID to dialias TLB entries belonging to
different guests.
Also add certain features present in the guest, with the naming scheme
cpu_guest_has_*. These are added separately to the main options bitfield
since they generally parallel similar features in the root context. A
few of these (FPU, MSA, watchpoints, perf counters, CP0_[X]ContextConfig
registers, MAAR registers, and probably others in future) can be
dynamically configured in the guest context, for which the
cpu_guest_has_dyn_* macros are added.
[ralf@linux-mips.org: Resolve merge conflict.]
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13231/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Add guest CP0 accessors and guest TLB operations along the same lines as
the existing macros and functions for the root CP0.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13229/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Add various register definitions to <asm/mipsregs.h> for the coprocessor
zero registers in the VZ ASE, namely CP0_GuestCtl0, CP0_GuestCtl0Ext,
CP0_GuestCtl1, CP0_GuestCtl2, CP0_GuestCtl3, and CP0_GTOffset.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13228/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
The decode_config4() function reads kscratch_mask from
CP0_Config4.KScrExist using a hard coded shift and mask. We already have
a definition for the mask in mipsregs.h, so add a definition for the
shift and make use of them.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13227/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|