path: root/arch
2016-02-25  m32r: fix m32104ut_defconfig build fail  [Sudip Mukherjee]
commit 601f1db653217f205ffa5fb33514b4e1711e56d1 upstream. The build of m32104ut_defconfig for m32r arch was failing for long long time with the error: ERROR: "memory_start" [fs/udf/udf.ko] undefined! ERROR: "memory_end" [fs/udf/udf.ko] undefined! ERROR: "memory_end" [drivers/scsi/sg.ko] undefined! ERROR: "memory_start" [drivers/scsi/sg.ko] undefined! ERROR: "memory_end" [drivers/i2c/i2c-dev.ko] undefined! ERROR: "memory_start" [drivers/i2c/i2c-dev.ko] undefined! As done in other architectures export the symbols to fix the error. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
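The fix follows the pattern the commit message points to on other architectures: the variables already exist in the m32r setup code and only need exporting so that modular users such as udf, sg and i2c-dev can resolve them. A minimal sketch of that shape (file placement and surrounding code are assumptions, not taken from the patch):

    #include <linux/export.h>

    /* set up during early boot by the m32r memory setup code */
    unsigned long memory_start;
    unsigned long memory_end;

    /* make the symbols resolvable from modules (udf, sg, i2c-dev, ...) */
    EXPORT_SYMBOL(memory_start);
    EXPORT_SYMBOL(memory_end);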
2016-02-24  ARM: 8517/1: ICST: avoid arithmetic overflow in icst_hz()  [Linus Walleij]
commit 5070fb14a0154f075c8b418e5bc58a620ae85a45 upstream. When trying to set the ICST 307 clock to 25174000 Hz I ran into this arithmetic error: the icst_hz_to_vco() correctly figure out DIVIDE=2, RDW=100 and VDW=99 yielding a frequency of 25174000 Hz out of the VCO. (I replicated the icst_hz() function in a spreadsheet to verify this.) However, when I called icst_hz() on these VCO settings it would instead return 4122709 Hz. This causes an error in the common clock driver for ICST as the common clock framework will call .round_rate() on the clock which will utilize icst_hz_to_vco() followed by icst_hz() suggesting the erroneous frequency, and then the clock gets set to this. The error did not manifest in the old clock framework since this high frequency was only used by the CLCD, which calls clk_set_rate() without first calling clk_round_rate() and since the old clock framework would not call clk_round_rate() before setting the frequency, the correct values propagated into the VCO. After some experimenting I figured out that it was due to a simple arithmetic overflow: the divisor for 24Mhz reference frequency as reference becomes 24000000*2*(99+8)=0x132212400 and the "1" in bit 32 overflows and is lost. But introducing an explicit 64-by-32 bit do_div() and casting the divisor into (u64) we get the right frequency back, and the right frequency gets set. Tested on the ARM Versatile. Cc: linux-clk@vger.kernel.org Cc: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
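The truncation is easy to reproduce outside the kernel. A small standalone sketch using the numbers quoted above (the kernel fix itself widens the divisor with a (u64) cast and uses do_div() for the 64-by-32 division; this only illustrates the arithmetic):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t ref = 24000000, vdw = 99, divide = 2;

        /* 32-bit product: 24000000 * 2 * (99 + 8) = 0x132212400, bit 32 is lost */
        uint32_t div32 = ref * divide * (vdw + 8);

        /* widening to 64 bits first, as the fix does with a (u64) cast, keeps bit 32 */
        uint64_t div64 = (uint64_t)ref * divide * (vdw + 8);

        printf("32-bit divisor: %u\n", div32);
        printf("64-bit divisor: %llu\n", (unsigned long long)div64);
        return 0;
    }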
2016-02-24  ARM: 8519/1: ICST: try other dividends than 1  [Linus Walleij]
commit e972c37459c813190461dabfeaac228e00aae259 upstream. Since the dawn of time the ICST code has only supported divide by one or hang in an eternal loop. Luckily we were always dividing by one because the reference frequency for the systems using the ICSTs is 24MHz and the [min,max] values for the PLL input if [10,320] MHz for ICST307 and [6,200] for ICST525, so the loop will always terminate immediately without assigning any divisor for the reference frequency. But for the code to make sense, let's insert the missing i++ Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
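The bug class is the familiar never-advancing search loop. A contrived sketch of the failure mode (identifiers and the divisor table are hypothetical, not the real ICST code):

    /* search a small divisor table for one that puts f/div into [lo, hi] */
    static unsigned int pick_divisor(unsigned long f, unsigned long lo,
                                     unsigned long hi)
    {
        static const unsigned int div[] = { 1, 2, 4, 8 };
        unsigned int i = 0;

        while (i < 3 && (f / div[i] < lo || f / div[i] > hi))
            i++;    /* the increment the original loop was missing */

        return div[i];
    }

Without the increment the loop only terminates when the first entry already fits, which is exactly the "divide by one or hang" behaviour described above.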
2016-02-24  ARM: 8471/1: need to save/restore arm register(r11) when it is corrupted  [Anson Huang]
commit fa0708b320f6da4c1104fe56e01b7abf66fd16ad upstream. In cpu_v7_do_suspend routine, r11 is used while it is NOT saved/restored, different compiler may have different usage of ARM general registers, so it may cause issues during calling cpu_v7_do_suspend. We meet kernel fault occurs when using GCC 4.8.3, r11 contains valid value before calling into cpu_v7_do_suspend, but when returned from this routine, r11 is corrupted and lead to kernel fault. Doing save/restore for those corrupted registers is a must in assemble code. Signed-off-by: Anson Huang <Anson.Huang@freescale.com> Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-24  ARM: dts: Kirkwood: Fix QNAP TS219 power-off  [Helmut Klein]
commit 5442f0eadf2885453d5b2ed8c8592f32a3744f8e upstream. The "reg" entry in the "poweroff" section of "kirkwood-ts219.dtsi" addressed the wrong uart (0 = console). This patch changes the address to select uart 1, which is the uart connected to the pic microcontroller, which can switch the device off. Signed-off-by: Helmut Klein <hgkr.klein@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Fixes: 4350a47bbac3 ("ARM: Kirkwood: Make use of the QNAP Power off driver.") Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-24  x86/mm/pat: Avoid truncation when converting cpa->numpages to address  [Matt Fleming]
commit 742563777e8da62197d6cb4b99f4027f59454735 upstream. There are a couple of nasty truncation bugs lurking in the pageattr code that can be triggered when mapping EFI regions, e.g. when we pass a cpa->pgd pointer. Because cpa->numpages is a 32-bit value, shifting left by PAGE_SHIFT will truncate the resultant address to 32-bits. Viorel-Cătălin managed to trigger this bug on his Dell machine that provides a ~5GB EFI region which requires 1236992 pages to be mapped. When calling populate_pud() the end of the region gets calculated incorrectly in the following buggy expression, end = start + (cpa->numpages << PAGE_SHIFT); And only 188416 pages are mapped. Next, populate_pud() gets invoked for a second time because of the loop in __change_page_attr_set_clr(), only this time no pages get mapped because shifting the remaining number of pages (1048576) by PAGE_SHIFT is zero. At which point the loop in __change_page_attr_set_clr() spins forever because we fail to map progress. Hitting this bug depends very much on the virtual address we pick to map the large region at and how many pages we map on the initial run through the loop. This explains why this issue was only recently hit with the introduction of commit a5caa209ba9c ("x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down") It's interesting to note that safe uses of cpa->numpages do exist in the pageattr code. If instead of shifting ->numpages we multiply by PAGE_SIZE, no truncation occurs because PAGE_SIZE is a UL value, and so the result is unsigned long. To avoid surprises when users try to convert very large cpa->numpages values to addresses, change the data type from 'int' to 'unsigned long', thereby making it suitable for shifting by PAGE_SHIFT without any type casting. The alternative would be to make liberal use of casting, but that is far more likely to cause problems in the future when someone adds more code and fails to cast properly; this bug was difficult enough to track down in the first place. Reported-and-tested-by: Viorel-Cătălin Răpițeanu <rapiteanu.catalin@gmail.com> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Link: https://bugzilla.kernel.org/show_bug.cgi?id=110131 Link: http://lkml.kernel.org/r/1454067370-10374-1-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
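A standalone illustration of the truncation on an LP64 machine, using the page count from the report above (the kernel fix simply makes the count unsigned long):

    #include <stdio.h>

    #define PAGE_SHIFT 12
    #define PAGE_SIZE  (1UL << PAGE_SHIFT)

    int main(void)
    {
        unsigned int numpages = 1236992;      /* ~5 GB worth of 4 KiB pages */
        unsigned long start = 0;

        /* a 32-bit count shifted left by PAGE_SHIFT wraps at 4 GB,
         * just as the old 'int' field did in the kernel */
        unsigned long bad_end  = start + (numpages << PAGE_SHIFT);

        /* multiplying by the UL-typed PAGE_SIZE (or making the count
         * unsigned long, as the fix does) keeps the full value */
        unsigned long good_end = start + (unsigned long)numpages * PAGE_SIZE;

        printf("truncated end: %#lx\n", bad_end);
        printf("correct end:   %#lx\n", good_end);
        return 0;
    }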
2016-02-24  s390: fix normalization bug in exception table sorting  [Ard Biesheuvel]
commit bcb7825a77f41c7dd91da6f7ac10b928156a322e upstream. The normalization pass in the sorting routine of the relative exception table serves two purposes: - it ensures that the address fields of the exception table entries are fully ordered, so that no ambiguities arise between entries with identical instruction offsets (i.e., when two instructions that are exactly 8 bytes apart each have an exception table entry associated with them) - it ensures that the offsets of both the instruction and the fixup fields of each entry are relative to their final location after sorting. Commit eb608fb366de ("s390/exceptions: switch to relative exception table entries") ported the relative exception table format from x86, but modified the sorting routine to only normalize the instruction offset field and not the fixup offset field. The result is that the fixup offset of each entry will be relative to the original location of the entry before sorting, likely leading to crashes when those entries are dereferenced. Fixes: eb608fb366de ("s390/exceptions: switch to relative exception table entries") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-15  parisc: Fix __ARCH_SI_PREAMBLE_SIZE  [Helge Deller]
commit e60fc5aa608eb38b47ba4ee058f306f739eb70a0 upstream. On a 64bit kernel build the compiler aligns the _sifields union in the struct siginfo_t on a 64bit address. The __ARCH_SI_PREAMBLE_SIZE define compensates for this alignment and thus fixes the wait testcase of the strace package. The symptoms of a wrong __ARCH_SI_PREAMBLE_SIZE value is that _sigchld.si_stime variable is missed to be copied and thus after a copy_siginfo() will have uninitialized values. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-15  parisc: Fix syscall restarts  [Helge Deller]
commit 71a71fb5374a23be36a91981b5614590b9e722c3 upstream. On parisc syscalls which are interrupted by signals sometimes failed to restart and instead returned -ENOSYS which in the worst case lead to userspace crashes. A similiar problem existed on MIPS and was fixed by commit e967ef02 ("MIPS: Fix restart of indirect syscalls"). On parisc the current syscall restart code assumes that all syscall callers load the syscall number in the delay slot of the ble instruction. That's how it is e.g. done in the unistd.h header file: ble 0x100(%sr2, %r0) ldi #syscall_nr, %r20 Because of that assumption the current code never restored %r20 before returning to userspace. This assumption is at least not true for code which uses the glibc syscall() function, which instead uses this syntax: ble 0x100(%sr2, %r0) copy regX, %r20 where regX depend on how the compiler optimizes the code and register usage. This patch fixes this problem by adding code to analyze how the syscall number is loaded in the delay branch and - if needed - copy the syscall number to regX prior returning to userspace for the syscall restart. Signed-off-by: Helge Deller <deller@gmx.de> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-15  parisc: Drop unused MADV_xxxK_PAGES flags from asm/mman.h  [Helge Deller]
commit dcbf0d299c00ed4f82ea8d6e359ad88a5182f9b8 upstream. Drop the MADV_xxK_PAGES flags, which were never used and were from a proposed API which was never integrated into the generic Linux kernel code. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-15  sh64: fix __NR_fgetxattr  [Dmitry V. Levin]
commit 2d33fa1059da4c8e816627a688d950b613ec0474 upstream. According to arch/sh/kernel/syscalls_64.S and common sense, __NR_fgetxattr has to be defined to 259, but it doesn't. Instead, it's defined to 269, which is of course used by another syscall, __NR_sched_setaffinity in this case. This bug was found by strace test suite. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
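The fix is a one-liner in the sh64 unistd header; a sketch of the intended numbering, using only the values named above:

    /* fgetxattr gets its own slot instead of reusing sched_setaffinity's */
    #define __NR_fgetxattr          259
    #define __NR_sched_setaffinity  269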
2016-02-12  openrisc: fix CONFIG_UID16 setting  [Andrew Morton]
commit 04ea1e91f85615318ea91ce8ab50cb6a01ee4005 upstream. openrisc-allnoconfig: kernel/uid16.c: In function 'SYSC_setgroups16': kernel/uid16.c:184:2: error: implicit declaration of function 'groups_alloc' kernel/uid16.c:184:13: warning: assignment makes pointer from integer without a cast openrisc shouldn't be setting CONFIG_UID16 when CONFIG_MULTIUSER=n. Fixes: 2813893f8b197a1 ("kernel: conditionally support non-root users, groups and capabilities") Reported-by: Fengguang Wu <fengguang.wu@gmail.com> Cc: Iulia Manda <iulia.manda21@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-12  x86: vvar, fix excessive gcc-6 DECLARE_VVAR warnings  [Jiri Slaby]
On 3.12, with gcc-6, I see a lot of: arch/x86/include/asm/vvar.h:33:28: warning: ‘vvaraddr_jiffies’ defined but not used [-Wunused-const-variable] static type const * const vvaraddr_ ## name = \ ^ arch/x86/include/asm/vvar.h:46:1: note: in expansion of macro ‘DECLARE_VVAR’ DECLARE_VVAR(0, volatile unsigned long, jiffies) ^~~~~~~~~~~~ In upstream, this is fixed by ef721987ae (x86, vdso: Introduce VVAR marco for vdso32) and f40c330091 (x86, vdso: Move the vvar and hpet mappings next to the 64-bit vDSO). But this is not applicable to stable. So mark the vvar declaration as __maybe_unused and be done with it. This will generate it to the code only if it is used. I.e. the same as with gcc < 6. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Andy Lutomirski <luto@amacapital.net>
2016-02-12  arm64: restore bogomips information in /proc/cpuinfo  [Yang Shi]
commit 92e788b749862ebe9920360513a718e5dd4da7a9 upstream. As previously reported, some userspace applications depend on bogomips showed by /proc/cpuinfo. Although there is much less legacy impact on aarch64 than arm, it does break libvirt. This patch reverts commit 326b16db9f69 ("arm64: delay: don't bother reporting bogomips in /proc/cpuinfo"), but with some tweak due to context change and without the pr_info(). Fixes: 326b16db9f69 ("arm64: delay: don't bother reporting bogomips in /proc/cpuinfo") Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-12  mn10300: Select CONFIG_HAVE_UID16 to fix build failure  [Guenter Roeck]
commit c86576ea114a9a881cf7328dc7181052070ca311 upstream. mn10300 builds fail with fs/stat.c: In function 'cp_old_stat': fs/stat.c:163:2: error: 'old_uid_t' undeclared ipc/util.c: In function 'ipc64_perm_to_ipc_perm': ipc/util.c:540:2: error: 'old_uid_t' undeclared Select CONFIG_HAVE_UID16 and remove local definition of CONFIG_UID16 to fix the problem. Fixes: fbc416ff8618 ("arm64: fix building without CONFIG_UID16") Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-12  arm64: mm: ensure that the zero page is visible to the page table walker  [Will Deacon]
commit 32d6397805d00573ce1fa55f408ce2bca15b0ad3 upstream. In paging_init, we allocate the zero page, memset it to zero and then point TTBR0 to it in order to avoid speculative fetches through the identity mapping. In order to guarantee that the freshly zeroed page is indeed visible to the page table walker, we need to execute a dsb instruction prior to writing the TTBR. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-12  arm64: Clear out any singlestep state on a ptrace detach operation  [John Blackwood]
commit 5db4fd8c52810bd9740c1240ebf89223b171aa70 upstream. Make sure to clear out any ptrace singlestep state when a ptrace(2) PTRACE_DETACH call is made on arm64 systems. Otherwise, the previously ptraced task will die off with a SIGTRAP signal if the debugger just previously singlestepped the ptraced task. Signed-off-by: John Blackwood <john.blackwood@ccur.com> [will: added comment to justify why this is in the arch code] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-12  arm64: KVM: Fix AArch32 to AArch64 register mapping  [Marc Zyngier]
commit c0f0963464c24e034b858441205455bf2a5d93ad upstream. When running a 32bit guest under a 64bit hypervisor, the ARMv8 architecture defines a mapping of the 32bit registers in the 64bit space. This includes banked registers that are being demultiplexed over the 64bit ones. On exceptions caused by an operation involving a 32bit register, the HW exposes the register number in the ESR_EL2 register. It was so far understood that SW had to distinguish between AArch32 and AArch64 accesses (based on the current AArch32 mode and register number). It turns out that I misinterpreted the ARM ARM, and the clue is in D1.20.1: "For some exceptions, the exception syndrome given in the ESR_ELx identifies one or more register numbers from the issued instruction that generated the exception. Where the exception is taken from an Exception level using AArch32 these register numbers give the AArch64 view of the register." Which means that the HW is already giving us the translated version, and that we shouldn't try to interpret it at all (for example, doing an MMIO operation from the IRQ mode using the LR register leads to very unexpected behaviours). The fix is thus not to perform a call to vcpu_reg32() at all from vcpu_reg(), and use whatever register number is supplied directly. The only case we need to find out about the mapping is when we actively generate a register access, which only occurs when injecting a fault in a guest. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-12  powerpc: Make {cmp}xchg* and their atomic_ versions fully ordered  [Boqun Feng]
commit 81d7a3294de7e9828310bbf986a67246b13fa01e upstream. According to memory-barriers.txt, xchg*, cmpxchg* and their atomic_ versions all need to be fully ordered, however they are now just RELEASE+ACQUIRE, which are not fully ordered. So also replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in __{cmp,}xchg_{u32,u64} respectively to guarantee fully ordered semantics of atomic{,64}_{cmp,}xchg() and {cmp,}xchg(), as a complement of commit b97021f85517 ("powerpc: Fix atomic_xxx_return barrier semantics") This patch depends on patch "powerpc: Make value-returning atomics fully ordered" for PPC_ATOMIC_ENTRY_BARRIER definition. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-12  powerpc: Make value-returning atomics fully ordered  [Boqun Feng]
commit 49e9cf3f0c04bf76ffa59242254110309554861d upstream. According to memory-barriers.txt: > Any atomic operation that modifies some state in memory and returns > information about the state (old or new) implies an SMP-conditional > general memory barrier (smp_mb()) on each side of the actual > operation ... Which mean these operations should be fully ordered. However on PPC, PPC_ATOMIC_ENTRY_BARRIER is the barrier before the actual operation, which is currently "lwsync" if SMP=y. The leading "lwsync" can not guarantee fully ordered atomics, according to Paul Mckenney: https://lkml.org/lkml/2015/10/14/970 To fix this, we define PPC_ATOMIC_ENTRY_BARRIER as "sync" to guarantee the fully-ordered semantics. This also makes futex atomics fully ordered, which can avoid possible memory ordering problems if userspace code relies on futex system call for fully ordered semantics. Fixes: b97021f85517 ("powerpc: Fix atomic_xxx_return barrier semantics") Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-02-12  powerpc/tm: Block signal return setting invalid MSR state  [Michael Neuling]
commit d2b9d2a5ad5ef04ff978c9923d19730cb05efd55 upstream. Currently we allow both the MSR T and S bits to be set by userspace on a signal return. Unfortunately this is a reserved configuration and will cause a TM Bad Thing exception if attempted (via rfid). This patch checks for this case in both the 32 and 64 bit signals code. If both T and S are set, we mark the context as invalid. Found using a syscall fuzzer. Fixes: 2b0a576d15e0 ("powerpc: Add new transactional memory state to the signal context") Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
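The check itself is small. A sketch of its shape in the signal-return path, assuming the usual MSR_TS_MASK definition (both transaction-state bits); the exact error value and placement are illustrative:

    /* TS = 0b11 (transactional and suspended at once) is a reserved
     * encoding; refuse to build a context that would rfid into it */
    if ((msr & MSR_TS_MASK) == MSR_TS_MASK)
        return -EINVAL;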
2016-01-27  ARM: 8160/1: drop warning about return_address not using unwind tables  [Uwe Kleine-König]
commit e16343c47e4276f5ebc77ca16feb5e50ca1918f9 upstream. The warning was introduced in 2009 (commit 4bf1fa5a34aa ([ARM] 5613/1: implement CALLER_ADDRESSx)). The only "problem" here is that CALLER_ADDRESSx for x > 1 returns NULL which doesn't do much harm. The drawback of implementing a fix (i.e. use unwind tables to implement CALLER_ADDRESSx) is that much of the unwinder code would need to be marked as not traceable. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-27  ARM: 8158/1: LLVMLinux: use static inline in ARM ftrace.h  [Behan Webster]
commit aeea3592a13bf12861943e44fc48f1f270941f8d upstream. With compilers which follow the C99 standard (like modern versions of gcc and clang), "extern inline" does the wrong thing (emits code for an externally linkable version of the inline function). In this case using static inline and removing the NULL version of return_address in return_address.c does the right thing. Signed-off-by: Behan Webster <behanw@converseincode.com> Reviewed-by: Mark Charlebois <charlebm@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-25  x86/boot: Double BOOT_HEAP_SIZE to 64KB  [H.J. Lu]
commit 8c31902cffc4d716450be549c66a67a8a3dd479c upstream. When decompressing kernel image during x86 bootup, malloc memory for ELF program headers may run out of heap space, which leads to system halt. This patch doubles BOOT_HEAP_SIZE to 64KB. Tested with 32-bit kernel which failed to boot without this patch. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-25  x86/reboot/quirks: Add iMac10,1 to pci_reboot_dmi_table[]  [Mario Kleiner]
commit 2f0c0b2d96b1205efb14347009748d786c2d9ba5 upstream. Without the reboot=pci method, the iMac 10,1 simply hangs after printing "Restarting system" at the point when it should reboot. This fixes it. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1450466646-26663-1-git-send-email-mario.kleiner.de@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-25  KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR  [Paul Mackerras]
commit c20875a3e638e4a03e099b343ec798edd1af5cc6 upstream. Currently it is possible for userspace (e.g. QEMU) to set a value for the MSR for a guest VCPU which has both of the TS bits set, which is an illegal combination. The result of this is that when we execute a hrfid (hypervisor return from interrupt doubleword) instruction to enter the guest, the CPU will take a TM Bad Thing type of program interrupt (vector 0x700). Now, if PR KVM is configured in the kernel along with HV KVM, we actually handle this without crashing the host or giving hypervisor privilege to the guest; instead what happens is that we deliver a program interrupt to the guest, with SRR0 reflecting the address of the hrfid instruction and SRR1 containing the MSR value at that point. If PR KVM is not configured in the kernel, then we try to run the host's program interrupt handler with the MMU set to the guest context, which almost certainly causes a host crash. This closes the hole by making kvmppc_set_msr_hv() check for the illegal combination and force the TS field to a safe value (00, meaning non-transactional). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-25  x86/xen: don't reset vcpu_info on a cancelled suspend  [Ouyang Zhaowei (Charles)]
commit 6a1f513776b78c994045287073e55bae44ed9f8c upstream. On a cancelled suspend the vcpu_info location does not change (it's still in the per-cpu area registered by xen_vcpu_setup()). So do not call xen_hvm_init_shared_info() which would make the kernel think its back in the shared info. With the wrong vcpu_info, events cannot be received and the domain will hang after a cancelled suspend. Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-25  x86/signal: Fix restart_syscall number for x32 tasks  [Dmitry V. Levin]
commit 22eab1108781eff09961ae7001704f7bd8fb1dce upstream. When restarting a syscall with regs->ax == -ERESTART_RESTARTBLOCK, regs->ax is assigned to a restart_syscall number. For x32 tasks, this syscall number must have __X32_SYSCALL_BIT set, otherwise it will be an x86_64 syscall number instead of a valid x32 syscall number. This issue has been there since the introduction of x32. Reported-by: strace/tests/restart_syscall.test Reported-and-tested-by: Elvira Khabirova <lineprinter0@gmail.com> Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Cc: Elvira Khabirova <lineprinter0@gmail.com> Link: http://lkml.kernel.org/r/20151130215436.GA25996@altlinux.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
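A sketch of the shape of the fix; the predicate for "task is x32" is a stand-in, not the real helper:

    /* an x32 task must be restarted with an x32 syscall number */
    if (task_is_x32)                     /* hypothetical predicate */
        regs->ax = __NR_restart_syscall | __X32_SYSCALL_BIT;
    else
        regs->ax = __NR_restart_syscall;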
2016-01-25  m68k/mac: Make SCC reset work more reliably  [Finn Thain]
commit 56931d73697c99ecf7aba6ae86c94d3a2d15d596 upstream. For SCC initialization we cannot assume that the control register is in the correct state to accept a register pointer. So first read from the control register in order to "sync" up. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Oliver Neukum <ONeukum@suse.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-25  m68k/mm: Check for mm != NULL in do_page_fault() debug code  [Geert Uytterhoeven]
commit 4e25c0e92f8eaf69bc51d1d523bcb7268e7dd162 upstream. When DEBUG is enabled, do_page_fault() may dereference a NULL pointer, causing recursive bus errors. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Oliver Neukum <ONeukum@suse.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-19  m32r: fix potential NULL-pointer dereference  [Kirill A. Shutemov]
commit fecf3743b824ce4eb275ed4a1d6aee9494f6a966 upstream. Add missing check for memory allocation fail. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oliver Neukum <ONeukum@suse.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-19  pm: use GFP_ATOMIC when pm core call this function  [Scott Jiang]
commit aefefe92116b776203f95f3249ae61b94f73f170 upstream. We shouldn't sleep in atomic sections. Signed-off-by: Scott Jiang <scott.jiang.linux@gmail.com> Cc: Oliver Neukum <ONeukum@suse.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-09  efi: Disable interrupts around EFI calls, not in the epilog/prolog calls  [Ingo Molnar]
commit 23a0d4e8fa6d3a1d7fb819f79bcc0a3739c30ba9 upstream. Tapasweni Pathak reported that we do a kmalloc() in efi_call_phys_prolog() on x86-64 while having interrupts disabled, which is a big no-no, as kmalloc() can sleep. Solve this by removing the irq disabling from the prolog/epilog calls around EFI calls: it's unnecessary, as in this stage we are single threaded in the boot thread, and we don't ever execute this from interrupt contexts. Reported-by: Tapasweni Pathak <tapaswenipathak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com> [ luis: backported to 3.10: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-05  x86/setup: Do not reserve crashkernel high memory if low reservation failed  [Baoquan He]
commit eb6db83d105914c246ac5875be76fd4b944833d5 upstream. People reported that when allocating crashkernel memory using the ",high" and ",low" syntax, there were cases where the reservation of the high portion succeeds but the reservation of the low portion fails. Then kexec can load the kdump kernel successfully, but booting the kdump kernel fails as there's no low memory. The low memory allocation for the kdump kernel can fail on large systems for a couple of reasons. For example, the manually specified crashkernel low memory can be too large and thus no adequate memblock region would be found. Therefore, we try to reserve low memory for the crash kernel *after* the high memory portion has been allocated. If that fails, we free crashkernel high memory too and return. The user can then take measures accordingly. Tested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Baoquan He <bhe@redhat.com> [ Massage text. ] Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Joerg Roedel <jroedel@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Young <dyoung@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: WANG Chao <chaowang@redhat.com> Cc: jerry_hoemann@hp.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/1445246268-26285-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-05  arm64: Fix compat register mappings  [Robin Murphy]
commit 5accd17d0eb523350c9ef754d655e379c9bb93b3 upstream. For reasons not entirely apparent, but now enshrined in history, the architectural mapping of AArch32 banked registers to AArch64 registers actually orders SP_<mode> and LR_<mode> backwards compared to the intuitive r13/r14 order, for all modes except FIQ. Fix the compat_<reg>_<mode> macros accordingly, in the hope of avoiding subtle bugs with KVM and AArch32 guests. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-05  x86/cpu: Fix SMAP check in PVOPS environments  [Andrew Cooper]
commit 581b7f158fe0383b492acd1ce3fb4e99d4e57808 upstream. There appears to be no formal statement of what pv_irq_ops.save_fl() is supposed to return precisely. Native returns the full flags, while lguest and Xen only return the Interrupt Flag, and both have comments by the implementations stating that only the Interrupt Flag is looked at. This may have been true when initially implemented, but no longer is. To make matters worse, the Xen PVOP leaves the upper bits undefined, making the BUG_ON() undefined behaviour. Experimentally, this now trips for 32bit PV guests on Broadwell hardware. The BUG_ON() is consistent for an individual build, but not consistent for all builds. It has also been a sitting timebomb since SMAP support was introduced. Use native_save_fl() instead, which will obtain an accurate view of the AC flag. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Tested-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <lguest@lists.ozlabs.org> Cc: Xen-devel <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/1433323874-6927-1-git-send-email-andrew.cooper3@citrix.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
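The fix swaps the paravirt-mediated flags read for the native one so the AC bit is actually observable. A sketch of the resulting check (placement in the SMAP setup path is an assumption):

    /* read the real EFLAGS rather than pv_irq_ops.save_fl(), which on Xen
     * and lguest only reports the interrupt flag */
    unsigned long eflags = native_save_fl();

    /* AC should have been cleared long before SMAP is enabled */
    BUG_ON(eflags & X86_EFLAGS_AC);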
2016-01-05  x86/cpu: Call verify_cpu() after having entered long mode too  [Borislav Petkov]
commit 04633df0c43d710e5f696b06539c100898678235 upstream. When we get loaded by a 64-bit bootloader, kernel entry point is startup_64 in head_64.S. We don't trust any and all bootloaders because some will fiddle with CPU configuration so we go ahead and massage each CPU into sanity again. For example, some dell BIOSes have this XD disable feature which set IA32_MISC_ENABLE[34] and disable NX. This might be some dumb workaround for other OSes but Linux sure doesn't need it. A similar thing is present in the Surface 3 firmware - see https://bugzilla.kernel.org/show_bug.cgi?id=106051 - which sets this bit only on the BSP: # rdmsr -a 0x1a0 400850089 850089 850089 850089 I know, right?! There's not even an off switch in there. So fix all those cases by sanitizing the 64-bit entry point too. For that, make verify_cpu() callable in 64-bit mode also. Requested-and-debugged-by: "H. Peter Anvin" <hpa@zytor.com> Reported-and-tested-by: Bastien Nocera <bugzilla@hadess.net> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1446739076-21303-1-git-send-email-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-05  x86/setup: Fix low identity map for >= 2GB kernel range  [Krzysztof Mazur]
commit 68accac392d859d24adcf1be3a90e41f978bd54c upstream. The commit f5f3497cad8c extended the low identity mapping. However, if the kernel uses more than 2 GB (VMSPLIT_2G_OPT or VMSPLIT_1G memory split), the normal memory mapping is overwritten by the low identity mapping causing a crash. To avoid overwritting, limit the low identity map to cover only memory before kernel range (PAGE_OFFSET). Fixes: f5f3497cad8c "x86/setup: Extend low identity map to cover whole kernel range Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/1446815916-22105-1-git-send-email-krzysiek@podlesie.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-05  x86/setup: Extend low identity map to cover whole kernel range  [Paolo Bonzini]
commit f5f3497cad8c8416a74b9aaceb127908755d020a upstream. On 32-bit systems, the initial_page_table is reused by efi_call_phys_prolog as an identity map to call SetVirtualAddressMap. efi_call_phys_prolog takes care of converting the current CPU's GDT to a physical address too. For PAE kernels the identity mapping is achieved by aliasing the first PDPE for the kernel memory mapping into the first PDPE of initial_page_table. This makes the EFI stub's trick "just work". However, for non-PAE kernels there is no guarantee that the identity mapping in the initial_page_table extends as far as the GDT; in this case, accesses to the GDT will cause a page fault (which quickly becomes a triple fault). Fix this by copying the kernel mappings from swapper_pg_dir to initial_page_table twice, both at PAGE_OFFSET and at identity mapping. For some reason, this is only reproducible with QEMU's dynamic translation mode, and not for example with KVM. However, even under KVM one can clearly see that the page table is bogus: $ qemu-system-i386 -pflash OVMF.fd -M q35 vmlinuz0 -s -S -daemonize $ gdb (gdb) target remote localhost:1234 (gdb) hb *0x02858f6f Hardware assisted breakpoint 1 at 0x2858f6f (gdb) c Continuing. Breakpoint 1, 0x02858f6f in ?? () (gdb) monitor info registers ... GDT= 0724e000 000000ff IDT= fffbb000 000007ff CR0=0005003b CR2=ff896000 CR3=032b7000 CR4=00000690 ... The page directory is sane: (gdb) x/4wx 0x32b7000 0x32b7000: 0x03398063 0x03399063 0x0339a063 0x0339b063 (gdb) x/4wx 0x3398000 0x3398000: 0x00000163 0x00001163 0x00002163 0x00003163 (gdb) x/4wx 0x3399000 0x3399000: 0x00400003 0x00401003 0x00402003 0x00403003 but our particular page directory entry is empty: (gdb) x/1wx 0x32b7000 + (0x724e000 >> 22) * 4 0x32b7070: 0x00000000 [ It appears that you can skate past this issue if you don't receive any interrupts while the bogus GDT pointer is loaded, or if you avoid reloading the segment registers in general. Andy Lutomirski provides some additional insight: "AFAICT it's entirely permissible for the GDTR and/or LDT descriptor to point to unmapped memory. Any attempt to use them (segment loads, interrupts, IRET, etc) will try to access that memory as if the access came from CPL 0 and, if the access fails, will generate a valid page fault with CR2 pointing into the GDT or LDT." Up until commit 23a0d4e8fa6d ("efi: Disable interrupts around EFI calls, not in the epilog/prolog calls") interrupts were disabled around the prolog and epilog calls, and the functional GDT was re-installed before interrupts were re-enabled. Which explains why no one has hit this issue until now. ] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Laszlo Ersek <lersek@redhat.com> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Matt Fleming <matt.fleming@intel.com> [ Updated changelog. ] Signed-off-by: Jiri Slaby <jslaby@suse.cz>
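Taken together with the follow-up fix listed above it ("Fix low identity map for >= 2GB kernel range"), the 32-bit setup code ends up copying the kernel mappings into initial_page_table twice: once at PAGE_OFFSET and once as an identity alias capped below the kernel range. A reconstructed sketch of that end state, so treat the details as approximate:

    /* kernel mapping at its usual PAGE_OFFSET slot */
    clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
                    swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
                    KERNEL_PGD_PTRS);

    /* identity alias of the same mappings, stopping before PAGE_OFFSET so
     * the normal kernel mapping is never overwritten */
    clone_pgd_range(initial_page_table,
                    swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
                    min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));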
2016-01-05  ARM: common: edma: Fix channel parameter for irq callbacks  [Peter Ujfalusi]
commit 696d8b70c09dd421c4d037fab04341e5b30585cf upstream. In case when the interrupt happened for the second eDMA the channel number was incorrectly passed to the client driver. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-05  ARM: 8427/1: dma-mapping: add support for offset parameter in dma_mmap()  [Marek Szyprowski]
commit 7e31210349e9e03a9a4dff31ab5f2bc83e8e84f5 upstream. IOMMU-based dma_mmap() implementation lacked proper support for offset parameter used in mmap call (it always assumed that mapping starts from offset zero). This patch adds support for offset parameter to IOMMU-based implementation. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2016-01-05  ARM: 8426/1: dma-mapping: add missing range check in dma_mmap()  [Marek Szyprowski]
commit 371f0f085f629fc0f66695f572373ca4445a67ad upstream. dma_mmap() function in IOMMU-based dma-mapping implementation lacked a check for valid range of mmap parameters (offset and buffer size), what might have caused access beyond the allocated buffer. This patch fixes this issue. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
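A sketch of the kind of bounds check being added, written with the generic vma helpers; the exact form and placement in the IOMMU mmap path are assumptions:

    unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
    unsigned long nr_vma   = vma_pages(vma);      /* pages the mmap asks for */
    unsigned long off      = vma->vm_pgoff;       /* offset into the buffer  */

    /* reject windows that start past the buffer or run off its end */
    if (off >= nr_pages || nr_vma > nr_pages - off)
        return -ENXIO;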
2015-12-14  MIPS: KVM: Uninit VCPU in vcpu_create error path  [James Hogan]
commit 585bb8f9a5e592f2ce7abbe5ed3112d5438d2754 upstream. If either of the memory allocations in kvm_arch_vcpu_create() fail, the vcpu which has been allocated and kvm_vcpu_init'd doesn't get uninit'd in the error handling path. Add a call to kvm_vcpu_uninit() to fix this. Fixes: 669e846e6c4e ("KVM/MIPS32: MIPS arch specific APIs for KVM") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
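The pattern of the fix is ordinary error unwinding: once kvm_vcpu_init() has succeeded, later allocation failures must jump to a label that calls kvm_vcpu_uninit() before freeing the vcpu. A sketch of that shape (the allocation and label names are illustrative, not the literal MIPS code):

    err = kvm_vcpu_init(vcpu, kvm, id);
    if (err)
        goto out_free_cpu;

    gebase = kzalloc(ALIGN(size, PAGE_SIZE), GFP_KERNEL);   /* may fail */
    if (!gebase) {
        err = -ENOMEM;
        goto out_uninit_cpu;        /* previously skipped the uninit */
    }

    return vcpu;

    out_uninit_cpu:
        kvm_vcpu_uninit(vcpu);
    out_free_cpu:
        kfree(vcpu);
        return ERR_PTR(err);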
2015-12-14  MIPS: KVM: Fix CACHE immediate offset sign extension  [James Hogan]
commit c5c2a3b998f1ff5a586f9d37e154070b8d550d17 upstream. The immediate field of the CACHE instruction is signed, so ensure that it gets sign extended by casting it to an int16_t rather than just masking the low 16 bits. Fixes: e685c689f3a8 ("KVM/MIPS32: Privileged instruction/target branch emulation.") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
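The difference between masking and sign extension is easy to see in a standalone snippet; the instruction word below is made up, and only its low 16 bits matter:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t insn = 0xbc50fff8;    /* made-up encoding, 16-bit immediate -8 */

        /* masking alone leaves the immediate as a large positive number */
        int32_t masked = insn & 0xffff;               /* 0x0000fff8 = 65528 */

        /* casting through int16_t sign-extends it, as the fix does */
        int32_t extended = (int16_t)(insn & 0xffff);  /* -8 */

        printf("masked:   %d\n", masked);
        printf("extended: %d\n", extended);
        return 0;
    }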
2015-12-14  MIPS: KVM: Fix ASID restoration logic  [James Hogan]
commit 002374f371bd02df864cce1fe85d90dc5b292837 upstream. ASID restoration on guest resume should determine the guest execution mode based on the guest Status register rather than bit 30 of the guest PC. Fix the two places in locore.S that do this, loading the guest status from the cop0 area. Note, this assembly is specific to the trap & emulate implementation of KVM, so it doesn't need to check the supervisor bit as that mode is not implemented in the guest. Fixes: b680f70fc111 ("KVM/MIPS32: Entry point for trampolining to...") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2015-11-18  x86/mm/hotplug: Modify PGD entry when removing memory  [Yasuaki Ishimatsu]
commit 9661d5bcd058fe15b4138a00d96bd36516134543 upstream. When hot-adding/removing memory, sync_global_pgds() is called for synchronizing PGD to PGD entries of all processes MM. But when hot-removing memory, sync_global_pgds() does not work correctly. At first, sync_global_pgds() checks whether target PGD is none or not. And if PGD is none, the PGD is skipped. But when hot-removing memory, PGD may be none since PGD may be cleared by free_pud_table(). So when sync_global_pgds() is called after hot-removing memory, sync_global_pgds() should not skip PGD even if the PGD is none. And sync_global_pgds() must clear PGD entries of all processes MM. Currently sync_global_pgds() does not clear PGD entries of all processes MM when hot-removing memory. So when hot adding memory which is same memory range as removed memory after hot-removing memory, following call traces are shown: kernel BUG at arch/x86/mm/init_64.c:206! ... [<ffffffff815e0c80>] kernel_physical_mapping_init+0x1b2/0x1d2 [<ffffffff815ced94>] init_memory_mapping+0x1d4/0x380 [<ffffffff8104aebd>] arch_add_memory+0x3d/0xd0 [<ffffffff815d03d9>] add_memory+0xb9/0x1b0 [<ffffffff81352415>] acpi_memory_device_add+0x1af/0x28e [<ffffffff81325dc4>] acpi_bus_device_attach+0x8c/0xf0 [<ffffffff813413b9>] acpi_ns_walk_namespace+0xc8/0x17f [<ffffffff81325d38>] ? acpi_bus_type_and_status+0xb7/0xb7 [<ffffffff81325d38>] ? acpi_bus_type_and_status+0xb7/0xb7 [<ffffffff813418ed>] acpi_walk_namespace+0x95/0xc5 [<ffffffff81326b4c>] acpi_bus_scan+0x9a/0xc2 [<ffffffff81326bff>] acpi_scan_bus_device_check+0x8b/0x12e [<ffffffff81326cb5>] acpi_scan_device_check+0x13/0x15 [<ffffffff81320122>] acpi_os_execute_deferred+0x25/0x32 [<ffffffff8107e02b>] process_one_work+0x17b/0x460 [<ffffffff8107edfb>] worker_thread+0x11b/0x400 [<ffffffff8107ece0>] ? rescuer_thread+0x400/0x400 [<ffffffff81085aef>] kthread+0xcf/0xe0 [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140 [<ffffffff815fc76c>] ret_from_fork+0x7c/0xb0 [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140 This patch clears PGD entries of all processes MM when sync_global_pgds() is called after hot-removing memory Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Vlastimil Babka <vbabka@suse.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2015-11-18  x86/mm/hotplug: Pass sync_global_pgds() a correct argument in remove_pagetable()  [Yasuaki Ishimatsu]
commit 5255e0a79fcc0ff47b387af92bd9ef5729b1b859 upstream. When hot-adding memory after hot-removing memory, following call traces are shown: kernel BUG at arch/x86/mm/init_64.c:206! ... [<ffffffff815e0c80>] kernel_physical_mapping_init+0x1b2/0x1d2 [<ffffffff815ced94>] init_memory_mapping+0x1d4/0x380 [<ffffffff8104aebd>] arch_add_memory+0x3d/0xd0 [<ffffffff815d03d9>] add_memory+0xb9/0x1b0 [<ffffffff81352415>] acpi_memory_device_add+0x1af/0x28e [<ffffffff81325dc4>] acpi_bus_device_attach+0x8c/0xf0 [<ffffffff813413b9>] acpi_ns_walk_namespace+0xc8/0x17f [<ffffffff81325d38>] ? acpi_bus_type_and_status+0xb7/0xb7 [<ffffffff81325d38>] ? acpi_bus_type_and_status+0xb7/0xb7 [<ffffffff813418ed>] acpi_walk_namespace+0x95/0xc5 [<ffffffff81326b4c>] acpi_bus_scan+0x9a/0xc2 [<ffffffff81326bff>] acpi_scan_bus_device_check+0x8b/0x12e [<ffffffff81326cb5>] acpi_scan_device_check+0x13/0x15 [<ffffffff81320122>] acpi_os_execute_deferred+0x25/0x32 [<ffffffff8107e02b>] process_one_work+0x17b/0x460 [<ffffffff8107edfb>] worker_thread+0x11b/0x400 [<ffffffff8107ece0>] ? rescuer_thread+0x400/0x400 [<ffffffff81085aef>] kthread+0xcf/0xe0 [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140 [<ffffffff815fc76c>] ret_from_fork+0x7c/0xb0 [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140 The patch-set fixes the issue. This patch (of 2): remove_pagetable() gets start argument and passes the argument to sync_global_pgds(). In this case, the argument must not be modified. If the argument is modified and passed to sync_global_pgds(), sync_global_pgds() does not correctly synchronize PGD to PGD entries of all processes MM since synchronized range of memory [start, end] is wrong. Unfortunately the start argument is modified in remove_pagetable(). So this patch fixes the issue. Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Vlastimil Babka <vbabka@suse.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2015-11-18  KVM: x86: Use new is_noncanonical_address in _linearize  [Nadav Amit]
commit 4be4de7ef9fd3a4d77320d4713970299ffecd286 upstream. Replace the current canonical address check with the new function which is identical. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2015-11-18  KVM: x86: Fix far-jump to non-canonical check  [Nadav Amit]
commit 7e46dddd6f6cd5dbf3c7bd04a7e75d19475ac9f2 upstream. Commit d1442d85cc30 ("KVM: x86: Handle errors when RIP is set during far jumps") introduced a bug that caused the fix to be incomplete. Due to incorrect evaluation, far jump to segment with L bit cleared (i.e., 32-bit segment) and RIP with any of the high bits set (i.e, RIP[63:32] != 0) set may not trigger #GP. As we know, this imposes a security problem. In addition, the condition for two warnings was incorrect. Fixes: d1442d85cc30ea75f7d399474ca738e0bc96f715 Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> [Add #ifdef CONFIG_X86_64 to avoid complaints of undefined behavior. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2015-11-18  KVM: svm: unconditionally intercept #DB  [Paolo Bonzini]
commit cbdb967af3d54993f5814f1cee0ed311a055377d upstream. This is needed to avoid the possibility that the guest triggers an infinite stream of #DB exceptions (CVE-2015-8104). VMX is not affected: because it does not save DR6 in the VMCS, it already intercepts #DB unconditionally. Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>