Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system
Pull "Disintegrate and delete asm/system.h" from David Howells:
"Here are a bunch of patches to disintegrate asm/system.h into a set of
separate bits to relieve the problem of circular inclusion
dependencies.
I've built all the working defconfigs from all the arches that I can
and made sure that they don't break.
The reason for these patches is that I recently encountered a circular
dependency problem that came about when I produced some patches to
optimise get_order() by rewriting it to use ilog2().
This uses bitops - and on the SH arch asm/bitops.h drags in
asm-generic/get_order.h by a circuituous route involving asm/system.h.
The main difficulty seems to be asm/system.h. It holds a number of
low level bits with no/few dependencies that are commonly used (eg.
memory barriers) and a number of bits with more dependencies that
aren't used in many places (eg. switch_to()).
These patches break asm/system.h up into the following core pieces:
(1) asm/barrier.h
Move memory barriers here. This already done for MIPS and Alpha.
(2) asm/switch_to.h
Move switch_to() and related stuff here.
(3) asm/exec.h
Move arch_align_stack() here. Other process execution related bits
could perhaps go here from asm/processor.h.
(4) asm/cmpxchg.h
Move xchg() and cmpxchg() here as they're full word atomic ops and
frequently used by atomic_xchg() and atomic_cmpxchg().
(5) asm/bug.h
Move die() and related bits.
(6) asm/auxvec.h
Move AT_VECTOR_SIZE_ARCH here.
Other arch headers are created as needed on a per-arch basis."
Fixed up some conflicts from other header file cleanups and moving code
around that has happened in the meantime, so David's testing is somewhat
weakened by that. We'll find out anything that got broken and fix it..
* tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
Delete all instances of asm/system.h
Remove all #inclusions of asm/system.h
Add #includes needed to permit the removal of asm/system.h
Move all declarations of free_initmem() to linux/mm.h
Disintegrate asm/system.h for OpenRISC
Split arch_align_stack() out from asm-generic/system.h
Split the switch_to() wrapper out of asm-generic/system.h
Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
Create asm-generic/barrier.h
Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
Disintegrate asm/system.h for Xtensa
Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
Disintegrate asm/system.h for Tile
Disintegrate asm/system.h for Sparc
Disintegrate asm/system.h for SH
Disintegrate asm/system.h for Score
Disintegrate asm/system.h for S390
Disintegrate asm/system.h for PowerPC
Disintegrate asm/system.h for PA-RISC
Disintegrate asm/system.h for MN10300
...
|
|
Pull kvm updates from Avi Kivity:
"Changes include timekeeping improvements, support for assigning host
PCI devices that share interrupt lines, s390 user-controlled guests, a
large ppc update, and random fixes."
This is with the sign-off's fixed, hopefully next merge window we won't
have rebased commits.
* 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
KVM: Convert intx_mask_lock to spin lock
KVM: x86: fix kvm_write_tsc() TSC matching thinko
x86: kvmclock: abstract save/restore sched_clock_state
KVM: nVMX: Fix erroneous exception bitmap check
KVM: Ignore the writes to MSR_K7_HWCR(3)
KVM: MMU: make use of ->root_level in reset_rsvds_bits_mask
KVM: PMU: add proper support for fixed counter 2
KVM: PMU: Fix raw event check
KVM: PMU: warn when pin control is set in eventsel msr
KVM: VMX: Fix delayed load of shared MSRs
KVM: use correct tlbs dirty type in cmpxchg
KVM: Allow host IRQ sharing for assigned PCI 2.3 devices
KVM: Ensure all vcpus are consistent with in-kernel irqchip settings
KVM: x86 emulator: Allow PM/VM86 switch during task switch
KVM: SVM: Fix CPL updates
KVM: x86 emulator: VM86 segments must have DPL 3
KVM: x86 emulator: Fix task switch privilege checks
arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice
KVM: x86 emulator: correctly mask pmc index bits in RDPMC instruction emulation
KVM: mmu_notifier: Flush TLBs before releasing mmu_lock
...
|
|
Delete all instances of asm/system.h as they should be redundant by this
point.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Move all declarations of free_initmem() to linux/mm.h so that there's only one
and it's used by everything.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-c6x-dev@linux-c6x.org
cc: microblaze-uclinux@itee.uq.edu.au
cc: linux-sh@vger.kernel.org
cc: sparclinux@vger.kernel.org
cc: x86@kernel.org
cc: linux-mm@kvack.org
|
|
Disintegrate asm/system.h for X86.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
cc: x86@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull <linux/bug.h> cleanup from Paul Gortmaker:
"The changes shown here are to unify linux's BUG support under the one
<linux/bug.h> file. Due to historical reasons, we have some BUG code
in bug.h and some in kernel.h -- i.e. the support for BUILD_BUG in
linux/kernel.h predates the addition of linux/bug.h, but old code in
kernel.h wasn't moved to bug.h at that time. As a band-aid, kernel.h
was including <asm/bug.h> to pseudo link them.
This has caused confusion[1] and general yuck/WTF[2] reactions. Here
is an example that violates the principle of least surprise:
CC lib/string.o
lib/string.c: In function 'strlcat':
lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON'
make[2]: *** [lib/string.o] Error 1
$
$ grep linux/bug.h lib/string.c
#include <linux/bug.h>
$
We've included <linux/bug.h> for the BUG infrastructure and yet we
still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.] Ugh -
very confusing for someone who is new to kernel development.
With the above in mind, the goals of this changeset are:
1) find and fix any include/*.h files that were relying on the
implicit presence of BUG code.
2) find and fix any C files that were consuming kernel.h and hence
relying on implicitly getting some/all BUG code.
3) Move the BUG related code living in kernel.h to <linux/bug.h>
4) remove the asm/bug.h from kernel.h to finally break the chain.
During development, the order was more like 3-4, build-test, 1-2. But
to ensure that git history for bisect doesn't get needless build
failures introduced, the commits have been reorderd to fix the problem
areas in advance.
[1] https://lkml.org/lkml/2012/1/3/90
[2] https://lkml.org/lkml/2012/1/17/414"
Fix up conflicts (new radeon file, reiserfs header cleanups) as per Paul
and linux-next.
* tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
kernel.h: doesn't explicitly use bug.h, so don't include it.
bug: consolidate BUILD_BUG_ON with other bug code
BUG: headers with BUG/BUG_ON etc. need linux/bug.h
bug.h: add include of it to various implicit C users
lib: fix implicit users of kernel.h for TAINT_WARN
spinlock: macroize assert_spin_locked to avoid bug.h dependency
x86: relocate get/set debugreg fcns to include/asm/debugreg.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb
Pull KGDB/KDB updates from Jason Wessel:
"Fixes:
- Fix KDB keyboard repeat scan codes and leaked keyboard events
- Fix kernel crash with kdb_printf() for users who compile new
kdb_printf()'s in early code
- Return all segment registers to gdb on x86_64
Features:
- KDB/KGDB hook the reboot notifier and end user can control if it
stops, detaches or does nothing (updated docs as well)
- Notify users who use CONFIG_DEBUG_RODATA to use hw breakpoints"
* tag 'for_linus-3.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
kdb: Add message about CONFIG_DEBUG_RODATA on failure to install breakpoint
kdb: Avoid using dbg_io_ops until it is initialized
kgdb,debug_core: add the ability to control the reboot notifier
KDB: Fix usability issues relating to the 'enter' key.
kgdb,debug-core,gdbstub: Hook the reboot notifier for debugger detach
kgdb: Respect that flush op is optional
kgdb: x86: Return all segment registers also in 64-bit mode
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull xen updates from Konrad Rzeszutek Wilk:
"which has three neat features:
- PV multiconsole support, so that there can be hvc1, hvc2, etc; This
can be used in HVM and in PV mode.
- P-state and C-state power management driver that uploads said power
management data to the hypervisor. It also inhibits cpufreq
scaling drivers to load so that only the hypervisor can make power
management decisions - fixing a weird perf bug.
There is one thing in the Kconfig that you won't like: "default y
if (X86_ACPI_CPUFREQ = y || X86_POWERNOW_K8 = y)" (note, that it
all depends on CONFIG_XEN which depends on CONFIG_PARAVIRT which by
default is off). I've a fix to convert that boolean expression
into "default m" which I am going to post after the cpufreq git
pull - as the two patches to make this work depend on a fix in Dave
Jones's tree.
- Function Level Reset (FLR) support in the Xen PCI backend.
Fixes:
- Kconfig dependencies for Xen PV keyboard and video
- Compile warnings and constify fixes
- Change over to use percpu_xxx instead of this_cpu_xxx"
Fix up trivial conflicts in drivers/tty/hvc/hvc_xen.c due to changes to
a removed commit.
* tag 'stable/for-linus-3.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen kconfig: relax INPUT_XEN_KBDDEV_FRONTEND deps
xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.
xen: constify all instances of "struct attribute_group"
xen/xenbus: ignore console/0
hvc_xen: introduce HVC_XEN_FRONTEND
hvc_xen: implement multiconsole support
hvc_xen: support PV on HVM consoles
xenbus: don't free other end details too early
xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it.
xen/setup/pm/acpi: Remove the call to boot_option_idle_override.
xenbus: address compiler warnings
xen: use this_cpu_xxx replace percpu_xxx funcs
xen/pciback: Support pci_reset_function, aka FLR or D3 support.
pci: Introduce __pci_reset_function_locked to be used when holding device_lock.
xen: Utilize the restore_msi_irqs hook.
|
|
Even if the content is always 0, gdb expects us to return also ds,
es, fs, and gs while in x86-64 mode. Do this to avoid ugly errors on
"info registers".
[jason.wessel@windriver.com: adjust NUMREGBYTES for two new regs]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform changes from Ingo Molnar.
Removes the Moorestown platform that nobody ever used.
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform: Move APIC ID validity check into platform APIC code
x86/olpc/xo15/sci: Enable lid close wakeup control
x86/geode/net5501: Add platform driver for Soekris Engineering net5501
x86/geode/alix2: Supplement driver to include GPIO button support
x86/mid/powerbtn: Use MSIC read/write instead of ipc_scu
x86/mid/thermal: Turn off thermistor
x86/mid/thermal: Add msic_thermal alias
x86/mid/thermal: Convert to use Intel MSIC API
x86/mid/scu_ipc: Remove Moorestown support
x86/mid: Kill off Moorestown
x86/mrst: Add msic_thermal platform support
x86/config: Select MSIC MFD driver on Intel Medfield platform
x86/mid: Remove Intel Moorestown
x86/mrst: Set ISA bus type for fake MP IRQs
x86/ioapic: Use legacy_pic to set correct gsi-irq mapping
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MCE changes from Ingo Molnar.
* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Fix return value of mce_chrdev_read() when erst is disabled
x86/mce: Convert static array of pointers to per-cpu variables
x86/mce: Replace hard coded hex constants with symbolic defines
x86/mce: Recognise machine check bank signature for data path error
x86/mce: Handle "action required" errors
x86/mce: Add mechanism to safely save information in MCE handler
x86/mce: Create helper function to save addr/misc when needed
HWPOISON: Add code to handle "action required" errors.
HWPOISON: Clean up memory_failure() vs. __memory_failure()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/fpu changes from Ingo Molnar.
* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
i387: Split up <asm/i387.h> into exported and internal interfaces
i387: Uninline the generic FP helpers that we expose to kernel modules
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/eficross (booting 32/64-bit kernel from 64/32-bit EFI) from Ingo Molnar
* 'x86-eficross-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, efi: Allow basic init with mixed 32/64-bit efi/kernel
x86, efi: Add basic error handling
x86, efi: Cleanup config table walking
x86, efi: Convert printk to pr_*()
x86, efi: Refactor efi_init() a bit
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/debug changes from Ingo Molnar.
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Fix section warnings
x86-64: Fix CFI data for common_interrupt()
x86: Properly _init-annotate NMI selftest code
x86/debug: Fix/improve the show_msr=<cpus> debug print out
|
|
'x86-cpufeature-for-linus', 'x86-process-for-linus' and 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull trivial x86 branches from Ingo Molnar: small one-liners to fix up
details.
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Remove some noise from boot log when starting cpus
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, boot: Fix port argument to inl() function
* 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, cpufeature: Add CPU features from Intel document 319433-012A
* 'x86-process-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86_64: Record stack pointer before task execution begins
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/UV: Lower UV rtc clocksource rating
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/atomic changes from Ingo Molnar.
* 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: atomic64 assembly improvements
x86: Adjust asm constraints in atomic64 wrappers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/asm changes from Ingo Molnar
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Include probe_roms.h in probe_roms.c
x86/32: Print control and debug registers for kerenel context
x86: Tighten dependencies of CPU_SUP_*_32
x86/numa: Improve internode cache alignment
x86: Fix the NMI nesting comments
x86-64: Improve insn scheduling in SAVE_ARGS_IRQ
x86-64: Fix CFI annotations for NMI nesting code
bitops: Add missing parentheses to new get_order macro
bitops: Optimise get_order()
bitops: Adjust the comment on get_order() to describe the size==0 case
x86/spinlocks: Eliminate TICKET_MASK
x86-64: Handle byte-wise tail copying in memcpy() without a loop
x86-64: Fix memcpy() to support sizes of 4Gb and above
x86-64: Fix memset() to support sizes of 4Gb and above
x86-64: Slightly shorten copy_page()
|
|
Pull irq_domain support for all architectures from Grant Likely:
"Generialize powerpc's irq_host as irq_domain
This branch takes the PowerPC irq_host infrastructure (reverse mapping
from Linux IRQ numbers to hardware irq numbering), generalizes it,
renames it to irq_domain, and makes it available to all architectures.
Originally the plan has been to create an all-new irq_domain
implementation which addresses some of the powerpc shortcomings such
as not handling 1:1 mappings well, but doing that proved to be far
more difficult and invasive than generalizing the working code and
refactoring it in-place. So, this branch rips out the 'new'
irq_domain and replaces it with the modified powerpc version (in a
fully bisectable way of course). It converts all users over to the
new API and makes irq_domain selectable on any architecture.
No architecture is forced to enable irq_domain, but the infrastructure
is required for doing OpenFirmware style irq translations. It will
even work on SPARC even though SPARC has it's own mechanism for
translating irqs at boot time. MIPS, microblaze, embedded x86 and c6x
are converted too.
The resulting irq_domain code is probably still too verbose and can be
optimized more, but that can be done incrementally and is a task for
follow-on patches."
* tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6: (31 commits)
dt: fix twl4030 for non-dt compile on x86
mfd: twl-core: Add IRQ_DOMAIN dependency
devicetree: Add empty of_platform_populate() for !CONFIG_OF_ADDRESS (sparc)
irq_domain: Centralize definition of irq_dispose_mapping()
irq_domain/mips: Allow irq_domain on MIPS
irq_domain/x86: Convert x86 (embedded) to use common irq_domain
ppc-6xx: fix build failure in flipper-pic.c and hlwd-pic.c
irq_domain/microblaze: Convert microblaze to use irq_domains
irq_domain/powerpc: Replace custom xlate functions with library functions
irq_domain/powerpc: constify irq_domain_ops
irq_domain/c6x: Use library of xlate functions
irq_domain/c6x: constify irq_domain structures
irq_domain/c6x: Convert c6x to use generic irq_domain support.
irq_domain: constify irq_domain_ops
irq_domain: Create common xlate functions that device drivers can use
irq_domain: Remove irq_domain_add_simple()
irq_domain: Remove 'new' irq_domain in favour of the ppc one
mfd: twl-core.c: Fix the number of interrupts managed by twl4030
of/address: add empty static inlines for !CONFIG_OF
irq_domain: Add support for base irq and hwirq in legacy mappings
...
|
|
Pull kmap_atomic cleanup from Cong Wang.
It's been in -next for a long time, and it gets rid of the (no longer
used) second argument to k[un]map_atomic().
Fix up a few trivial conflicts in various drivers, and do an "evil
merge" to catch some new uses that have come in since Cong's tree.
* 'kmap_atomic' of git://github.com/congwang/linux: (59 commits)
feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal
highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename]
drbd: remove the second argument of k[un]map_atomic()
zcache: remove the second argument of k[un]map_atomic()
gma500: remove the second argument of k[un]map_atomic()
dm: remove the second argument of k[un]map_atomic()
tomoyo: remove the second argument of k[un]map_atomic()
sunrpc: remove the second argument of k[un]map_atomic()
rds: remove the second argument of k[un]map_atomic()
net: remove the second argument of k[un]map_atomic()
mm: remove the second argument of k[un]map_atomic()
lib: remove the second argument of k[un]map_atomic()
power: remove the second argument of k[un]map_atomic()
kdb: remove the second argument of k[un]map_atomic()
udf: remove the second argument of k[un]map_atomic()
ubifs: remove the second argument of k[un]map_atomic()
squashfs: remove the second argument of k[un]map_atomic()
reiserfs: remove the second argument of k[un]map_atomic()
ocfs2: remove the second argument of k[un]map_atomic()
ntfs: remove the second argument of k[un]map_atomic()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core patches for 3.4-rc1 from Greg KH:
"Here's the big driver core merge for 3.4-rc1.
Lots of various things here, sysfs fixes/tweaks (with the nlink
breakage reverted), dynamic debugging updates, w1 drivers, hyperv
driver updates, and a variety of other bits and pieces, full
information in the shortlog."
* tag 'driver-core-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (78 commits)
Tools: hv: Support enumeration from all the pools
Tools: hv: Fully support the new KVP verbs in the user level daemon
Drivers: hv: Support the newly introduced KVP messages in the driver
Drivers: hv: Add new message types to enhance KVP
regulator: Support driver probe deferral
Revert "sysfs: Kill nlink counting."
uevent: send events in correct order according to seqnum (v3)
driver core: minor comment formatting cleanups
driver core: move the deferred probe pointer into the private area
drivercore: Add driver probe deferral mechanism
DS2781 Maxim Stand-Alone Fuel Gauge battery and w1 slave drivers
w1_bq27000: Only one thread can access the bq27000 at a time.
w1_bq27000 - remove w1_bq27000_write
w1_bq27000: remove unnecessary NULL test.
sysfs: Fix memory leak in sysfs_sd_setsecdata().
intel_idle: Revert change of auto_demotion_disable_flags for Nehalem
w1: Fix w1_bq27000
driver-core: documentation: fix up Greg's email address
powernow-k6: Really enable auto-loading
powernow-k7: Fix CPU family number
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer changes for v3.4 from Ingo Molnar
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
ntp: Fix integer overflow when setting time
math: Introduce div64_long
cs5535-clockevt: Allow the MFGPT IRQ to be shared
cs5535-clockevt: Don't ignore MFGPT on SMP-capable kernels
x86/time: Eliminate unused irq0_irqs counter
clocksource: scx200_hrt: Fix the build
x86/tsc: Reduce the TSC sync check time for core-siblings
timer: Fix bad idle check on irq entry
nohz: Remove ts->Einidle checks before restarting the tick
nohz: Remove update_ts_time_stat from tick_nohz_start_idle
clockevents: Leave the broadcast device in shutdown mode when not needed
clocksource: Load the ACPI PM clocksource asynchronously
clocksource: scx200_hrt: Convert scx200 to use clocksource_register_hz
clocksource: Get rid of clocksource_calc_mult_shift()
clocksource: dbx500: convert to clocksource_register_hz()
clocksource: scx200_hrt: use pr_<level> instead of printk
time: Move common updates to a function
time: Reorder so the hot data is together
time: Remove most of xtime_lock usage in timekeeping.c
ntp: Add ntp_lock to replace xtime_locking
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes for v3.4 from Ingo Molnar
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
printk: Make it compile with !CONFIG_PRINTK
sched/x86: Fix overflow in cyc2ns_offset
sched: Fix nohz load accounting -- again!
sched: Update yield() docs
printk/sched: Introduce special printk_sched() for those awkward moments
sched/nohz: Correctly initialize 'next_balance' in 'nohz' idle balancer
sched: Cleanup cpu_active madness
sched: Fix load-balance wreckage
sched: Clean up parameter passing of proc_sched_autogroup_set_nice()
sched: Ditch per cgroup task lists for load-balancing
sched: Rename load-balancing fields
sched: Move load-balancing arguments into helper struct
sched/rt: Do not submit new work when PI-blocked
sched/rt: Prevent idle task boosting
sched/wait: Add __wake_up_all_locked() API
sched/rt: Document scheduler related skip-resched-check sites
sched/rt: Use schedule_preempt_disabled()
sched/rt: Add schedule_preempt_disabled()
sched/rt: Do not throttle when PI boosting
sched/rt: Keep period timer ticking when rt throttling is active
...
|
|
[swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename]
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
|
|
Upon resume from hibernation, CPU 0's hvclock area contains the old
values for system_time and tsc_timestamp. It is necessary for the
hypervisor to update these values with uptodate ones before the CPU uses
them.
Abstract TSC's save/restore sched_clock_state functions and use
restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.
Also move restore_sched_clock_state before __restore_processor_state,
since the later calls CONFIG_LOCK_STAT's lockstat_clock (also for TSC).
Thanks to Igor Mammedov for tracking it down.
Fixes suspend-to-disk with kvmclock.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Move APIC ID validity check into platform APIC code, so it can
be overridden when needed. For NumaChip systems, always trust
MADT, as it's constructed with high APIC IDs.
Behaviour verifies on standard x86 systems and on NumaChip
systems with this, and compile-tested with allyesconfig.
Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
Reviewed-by: Steffen Persvold <sp@numascale.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1331709454-27966-1-git-send-email-daniel@numascale-asia.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Merge reason: Update to the almost-final v3.3 kernel.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce
Apply two miscellaneous MCE fixes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
When a machine boots up, the TSC generally gets reset. However,
when kexec is used to boot into a kernel, the TSC value would be
carried over from the previous kernel. The computation of
cycns_offset in set_cyc2ns_scale is prone to an overflow, if the
machine has been up more than 208 days prior to the kexec. The
overflow happens when we multiply *scale, even though there is
enough room to store the final answer.
We fix this issue by decomposing tsc_now into the quotient and
remainder of division by CYC2NS_SCALE_FACTOR and then performing
the multiplication separately on the two components.
Refactor code to share the calculation with the previous
fix in __cycles_2_ns().
Signed-off-by: Salman Qazi <sqazi@google.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Turner <pjt@google.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20120310004027.19291.88460.stgit@dungbeetle.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
For the hypervisor to take advantage of the MWAIT support it needs
to extract from the ACPI _CST the register address. But the
hypervisor does not have the support to parse DSDT so it relies on
the initial domain (dom0) to parse the ACPI Power Management information
and push it up to the hypervisor. The pushing of the data is done
by the processor_harveset_xen module which parses the information that
the ACPI parser has graciously exposed in 'struct acpi_processor'.
For the ACPI parser to also expose the Cx states for MWAIT, we need
to expose the MWAIT capability (leaf 1). Furthermore we also need to
expose the MWAIT_LEAF capability (leaf 5) for cstate.c to properly
function.
The hypervisor could expose these flags when it traps the XEN_EMULATE_PREFIX
operations, but it can't do it since it needs to be backwards compatible.
Instead we choose to use the native CPUID to figure out if the MWAIT
capability exists and use the XEN_SET_PDC query hypercall to figure out
if the hypervisor wants us to expose the MWAIT_LEAF capability or not.
Note: The XEN_SET_PDC query was implemented in c/s 23783:
"ACPI: add _PDC input override mechanism".
With this in place, instead of
C3 ACPI IOPORT 415
we get now
C3:ACPI FFH INTEL MWAIT 0x20
Note: The cpu_idle which would be calling the mwait variants for idling
never gets set b/c we set the default pm_idle to be the hypercall variant.
Acked-by: Jan Beulich <JBeulich@suse.com>
[v2: Fix missing header file include and #ifdef]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
This was done to resolve a conflict in the drivers/base/cpu.c file.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Print warning once if pin control bit is set in eventsel msr since
emulation does not support it yet.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Task switches can switch between Protected Mode and VM86. The current
mode must be updated during the task switch emulation so that the new
segment selectors are interpreted correctly.
In order to let privilege checks succeed, rflags needs to be updated in
the vcpu struct as this causes a CPL update.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Currently, all task switches check privileges against the DPL of the
TSS. This is only correct for jmp/call to a TSS. If a task gate is used,
the DPL of this take gate is used for the check instead. Exceptions,
external interrupts and iret shouldn't perform any check.
[avi: kill kvm-kmod remnants]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Some members of kvm_memory_slot are not used by every architecture.
This patch is the first step to make this difference clear by
introducing kvm_memory_slot::arch; lpage_info is moved into it.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
This allows us to track the original nanosecond and counter values
at each phase of TSC writing by the guest. This gets us perfect
offset matching for stable TSC systems, and perfect software
computed TSC matching for machines with unstable TSC.
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
During a host suspend, TSC may go backwards, which KVM interprets
as an unstable TSC. Technically, KVM should not be marking the
TSC unstable, which causes the TSC clocksource to go bad, but we
need to be adjusting the TSC offsets in such a case.
Dealing with this issue is a little tricky as the only place we
can reliably do it is before much of the timekeeping infrastructure
is up and running. On top of this, we are not in a KVM thread
context, so we may not be able to safely access VCPU fields.
Instead, we compute our best known hardware offset at power-up and
stash it to be applied to all VCPUs when they actually start running.
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Redefine the API to take a parameter indicating whether an
adjustment is in host or guest cycles.
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The variable last_host_tsc was removed from upstream code. I am adding
it back for two reasons. First, it is unnecessary to use guest TSC
computation to conclude information about the host TSC. The guest may
set the TSC backwards (this case handled by the previous patch), but
the computation of guest TSC (and fetching an MSR) is significanlty more
work and complexity than simply reading the hardware counter. In addition,
we don't actually need the guest TSC for any part of the computation,
by always recomputing the offset, we can eliminate the need to deal with
the current offset and any scaling factors that may apply.
The second reason is that later on, we are going to be using the host
TSC value to restore TSC offsets after a host S4 suspend, so we need to
be reading the host values, not the guest values here.
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
There are a few improvements that can be made to the TSC offset
matching code. First, we don't need to call the 128-bit multiply
(especially on a constant number), the code works much nicer to
do computation in nanosecond units.
Second, the way everything is setup with software TSC rate scaling,
we currently have per-cpu rates. Obviously this isn't too desirable
to use in practice, but if for some reason we do change the rate of
all VCPUs at runtime, then reset the TSCs, we will only want to
match offsets for VCPUs running at the same rate.
Finally, for the case where we have an unstable host TSC, but
rate scaling is being done in hardware, we should call the platform
code to compute the TSC offset, so the math is reorganized to recompute
the base instead, then transform the base into an offset using the
existing API.
[avi: fix 64-bit division on i386]
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
KVM: Fix 64-bit division in kvm_write_tsc()
Breaks i386 build.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
This requires some restructuring; rather than use 'virtual_tsc_khz'
to indicate whether hardware rate scaling is in effect, we consider
each VCPU to always have a virtual TSC rate. Instead, there is new
logic above the vendor-specific hardware scaling that decides whether
it is even necessary to use and updates all rate variables used by
common code. This means we can simply query the virtual rate at
any point, which is needed for software rate scaling.
There is also now a threshold added to the TSC rate scaling; minor
differences and variations of measured TSC rate can accidentally
provoke rate scaling to be used when it is not needed. Instead,
we have a tolerance variable called tsc_tolerance_ppm, which is
the maximum variation from user requested rate at which scaling
will be used. The default is 250ppm, which is the half the
threshold for NTP adjustment, allowing for some hardware variation.
In the event that hardware rate scaling is not available, we can
kludge a bit by forcing TSC catchup to turn on when a faster than
hardware speed has been requested, but there is nothing available
yet for the reverse case; this requires a trap and emulate software
implementation for RDTSC, which is still forthcoming.
[avi: fix 64-bit division on i386]
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
This patch adds the LBR definitions for NHM/WSM/SNB and Core.
It also adds the definitions for the architected LBR MSR:
LBR_SELECT, LBRT_TOS.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Increase recommended max vcpus from 64 to 160 (tested internally
at Red Hat).
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
When kvm guest uses kvmclock, it may hang on vcpu hot-plug.
This is caused by an overflow in pvclock_get_nsec_offset,
u64 delta = tsc - shadow->tsc_timestamp;
which in turn is caused by an undefined values from percpu
hv_clock that hasn't been initialized yet.
Uninitialized clock on being booted cpu is accessed from
start_secondary
-> smp_callin
-> smp_store_cpu_info
-> identify_secondary_cpu
-> mtrr_ap_init
-> mtrr_restore
-> stop_machine_from_inactive_cpu
-> queue_stop_cpus_work
...
-> sched_clock
-> kvm_clock_read
which is well before x86_cpuinit.setup_percpu_clockev call in
start_secondary, where percpu clock is initialized.
This patch introduces a hook that allows to setup/initialize
per_cpu clock early and avoid overflow due to reading
- undefined values
- old values if cpu was offlined and then onlined again
Another possible early user of this clock source is ftrace that
accesses it to get timestamps for ring buffer entries. So if
mtrr_ap_init is moved from identify_secondary_cpu to past
x86_cpuinit.setup_percpu_clockev in start_secondary, ftrace
may cause the same overflow/hang on cpu hot-plug anyway.
More complete description of the problem:
https://lkml.org/lkml/2012/2/2/101
Credits to Marcelo Tosatti <mtosatti@redhat.com> for hook idea.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
On some cpus the overhead for virtualization instructions is in the same
range as a system call. Having to call multiple ioctls to get set registers
will make certain userspace handled exits more expensive than necessary.
Lets provide a section in kvm_run that works as a shared save area
for guest registers.
We also provide two 64bit flags fields (architecture specific), that will
specify
1. which parts of these fields are valid.
2. which registers were modified by userspace
Each bit for these flag fields will define a group of registers (like
general purpose) or a single register.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
In some cases guests should not provide workarounds for errata even when the
physical processor is affected. For example, because of erratum 400 on family
10h processors a Linux guest will read an MSR (resulting in VMEXIT) before
going to idle in order to avoid getting stuck in a non-C0 state. This is not
necessary: HLT and IO instructions are intercepted and therefore there is no
reason for erratum 400 workaround in the guest.
This patch allows us to present a guest with certain errata as fixed,
regardless of the state of actual hardware.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Conflicts:
tools/perf/builtin-record.c
tools/perf/builtin-top.c
tools/perf/perf.h
tools/perf/util/top.h
Merge reason: resolve these cherry-picking conflicts.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any
other BUG variant in a static inline (i.e. not in a #define) then
that header really should be including <linux/bug.h> and not just
expecting it to be implicitly present.
We can make this change risk-free, since if the files using these
headers didn't have exposure to linux/bug.h already, they would have
been causing compile failures/warnings.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
It turned out that a performance counter on AMD does not
count at all when the GO or HO bit is set in the control
register and SVM is disabled in EFER.
This patch works around this issue by masking out the HO bit
in the performance counter control register when SVM is not
enabled.
The GO bit is not touched because it is only set when the
user wants to count in guest-mode only. So when SVM is
disabled the counter should not run at all and the
not-counting is the intended behaviour.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Avi Kivity <avi@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: stable@vger.kernel.org # v3.2
Link: http://lkml.kernel.org/r/1330523852-19566-1-git-send-email-joerg.roedel@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Since we already have a debugreg.h header file, move the
assoc. get/set functions to it. In addition to it being the
logical home for them, it has a secondary advantage. The
functions that are moved use BUG(). So we really need to
have linux/bug.h in scope. But asm/processor.h is used about
600 times, vs. only about 15 for debugreg.h -- so adding bug.h
to the latter reduces the amount of time we'll be processing
it during a compile.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
|