Age | Commit message (Collapse) | Author |
|
commit af91568e762d04931dcbdd6bef4655433d8b9418 upstream.
The uncore_collect_events functions assumes that event group
might contain only uncore events which is wrong, because it
might contain any type of events.
This bug leads to uncore framework touching 'not' uncore events,
which could end up all sorts of bugs.
One was triggered by Vince's perf fuzzer, when the uncore code
touched breakpoint event private event space as if it was uncore
event and caused BUG:
BUG: unable to handle kernel paging request at ffffffff82822068
IP: [<ffffffff81020338>] uncore_assign_events+0x188/0x250
...
The code in uncore_assign_events() function was looking for
event->hw.idx data while the event was initialized as a
breakpoint with different members in event->hw union.
This patch forces uncore_collect_events() to collect only uncore
events.
Reported-by: Vince Weaver <vince@deater.net>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1418243031-20367-2-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit e55355453600a33bb5ca4f71f2d7214875f3b061 upstream.
Enabling the hardware I/O coherency on Armada 370, Armada 375, Armada
38x and Armada XP requires a certain number of conditions:
- On Armada 370, the cache policy must be set to write-allocate.
- On Armada 375, 38x and XP, the cache policy must be set to
write-allocate, the pages must be mapped with the shareable
attribute, and the SMP bit must be set
Currently, on Armada XP, when CONFIG_SMP is enabled, those conditions
are met. However, when Armada XP is used in a !CONFIG_SMP kernel, none
of these conditions are met. With Armada 370, the situation is worse:
since the processor is single core, regardless of whether CONFIG_SMP
or !CONFIG_SMP is used, the cache policy will be set to write-back by
the kernel and not write-allocate.
Since solving this problem turns out to be quite complicated, and we
don't want to let users with a mainline kernel known to have
infrequent but existing data corruptions, this commit proposes to
simply disable hardware I/O coherency in situations where it is known
not to work.
And basically, the is_smp() function of the kernel tells us whether it
is OK to enable hardware I/O coherency or not, so this commit slightly
refactors the coherency_type() function to return
COHERENCY_FABRIC_TYPE_NONE when is_smp() is false, or the appropriate
type of the coherency fabric in the other case.
Thanks to this, the I/O coherency fabric will no longer be used at all
in !CONFIG_SMP configurations. It will continue to be used in
CONFIG_SMP configurations on Armada XP, Armada 375 and Armada 38x
(which are multiple cores processors), but will no longer be used on
Armada 370 (which is a single core processor).
In the process, it simplifies the implementation of the
coherency_type() function, and adds a missing call to of_node_put().
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Fixes: e60304f8cb7bb545e79fe62d9b9762460c254ec2 ("arm: mvebu: Add hardware I/O Coherency support")
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Link: https://lkml.kernel.org/r/1415871540-20302-3-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 4bf9636c39ac70da091d5a2e28d3448eaa7f115c upstream.
Commit 9fc2105aeaaf ("ARM: 7830/1: delay: don't bother reporting
bogomips in /proc/cpuinfo") breaks audio in python, and probably
elsewhere, with message
FATAL: cannot locate cpu MHz in /proc/cpuinfo
I'm not the first one to hit it, see for example
https://theredblacktree.wordpress.com/2014/08/10/fatal-cannot-locate-cpu-mhz-in-proccpuinfo/
https://devtalk.nvidia.com/default/topic/765800/workaround-for-fatal-cannot-locate-cpu-mhz-in-proc-cpuinf/?offset=1
Reading original changelog, I have to say "Stop breaking working setups.
You know who you are!".
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
omap4_init_static_deps
commit 9008d83fe9dc2e0f19b8ba17a423b3759d8e0fd7 upstream.
Commit 705814b5ea6f ("ARM: OMAP4+: PM: Consolidate OMAP4 PM code to
re-use it for OMAP5")
Moved logic generic for OMAP5+ as part of the init routine by
introducing omap4_pm_init. However, the patch left the powerdomain
initial setup, an unused omap4430 es1.0 check and a spurious log
"Power Management for TI OMAP4." in the original code.
Remove the duplicate code which is already present in omap4_pm_init from
omap4_init_static_deps.
As part of this change, also move the u-boot version print out of the
static dependency function to the omap4_pm_init function.
Fixes: 705814b5ea6f ("ARM: OMAP4+: PM: Consolidate OMAP4 PM code to re-use it for OMAP5")
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 007487f1fd43d84f26cda926081ca219a24ecbc4 upstream.
Currently we enable Exynos devices in the multi v7 defconfig, however, when
testing on my ODROID-U3, I noticed that USB was not working. Enabling this
option causes USB to work, which enables networking support as well since the
ODROID-U3 has networking on the USB bus.
[arnd] Support for odroid-u3 was added in 3.10, so it would be nice to
backport this fix at least that far.
Signed-off-by: Steev Klimaszewski <steev@gentoo.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 1ddf0b1b11aa8a90cef6706e935fc31c75c406ba upstream.
In Linux 3.18 and below, GCC hoists the lsl instructions in the
pvclock code all the way to the beginning of __vdso_clock_gettime,
slowing the non-paravirt case significantly. For unknown reasons,
presumably related to the removal of a branch, the performance issue
is gone as of
e76b027e6408 x86,vdso: Use LSL unconditionally for vgetcpu
but I don't trust GCC enough to expect the problem to stay fixed.
There should be no correctness issue, because the __getcpu calls in
__vdso_vlock_gettime were never necessary in the first place.
Note to stable maintainers: In 3.18 and below, depending on
configuration, gcc 4.9.2 generates code like this:
9c3: 44 0f 03 e8 lsl %ax,%r13d
9c7: 45 89 eb mov %r13d,%r11d
9ca: 0f 03 d8 lsl %ax,%ebx
This patch won't apply as is to any released kernel, but I'll send a
trivial backported version if needed.
[
Backported by Andy Lutomirski. Should apply to all affected
versions. This fixes a functionality bug as well as a performance
bug: buggy kernels can infinite loop in __vdso_clock_gettime on
affected compilers. See, for exammple:
https://bugzilla.redhat.com/show_bug.cgi?id=1178975
]
Fixes: 51c19b4f5927 x86: vdso: pvclock gettime support
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 394f56fe480140877304d342dec46d50dc823d46 upstream.
The theory behind vdso randomization is that it's mapped at a random
offset above the top of the stack. To avoid wasting a page of
memory for an extra page table, the vdso isn't supposed to extend
past the lowest PMD into which it can fit. Other than that, the
address should be a uniformly distributed address that meets all of
the alignment requirements.
The current algorithm is buggy: the vdso has about a 50% probability
of being at the very end of a PMD. The current algorithm also has a
decent chance of failing outright due to incorrect handling of the
case where the top of the stack is near the top of its PMD.
This fixes the implementation. The paxtest estimate of vdso
"randomisation" improves from 11 bits to 18 bits. (Disclaimer: I
don't know what the paxtest code is actually calculating.)
It's worth noting that this algorithm is inherently biased: the vdso
is more likely to end up near the end of its PMD than near the
beginning. Ideally we would either nix the PMD sharing requirement
or jointly randomize the vdso and the stack to reduce the bias.
In the mean time, this is a considerable improvement with basically
no risk of compatibility issues, since the allowed outputs of the
algorithm are unchanged.
As an easy test, doing this:
for i in `seq 10000`
do grep -P vdso /proc/self/maps |cut -d- -f1
done |sort |uniq -d
used to produce lots of output (1445 lines on my most recent run).
A tiny subset looks like this:
7fffdfffe000
7fffe01fe000
7fffe05fe000
7fffe07fe000
7fffe09fe000
7fffe0bfe000
7fffe0dfe000
Note the suspicious fe000 endings. With the fix, I get a much more
palatable 76 repeated addresses.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit a629df7eadffb03e6ce4a8616e62ea29fdf69b6b upstream.
Since most virtual machines raise this message once, it is a bit annoying.
Make it KERN_DEBUG severity.
Fixes: 7a2e8aaf0f6873b47bc2347f216ea5b0e4c258ab
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 8117ac6a6c2fa0f847ff6a21a1f32c8d2c8501d0 upstream.
Currently, when going idle, we set the flag indicating that we are in
nap mode (paca->kvm_hstate.hwthread_state) and then execute the nap
(or sleep or rvwinkle) instruction, all with the MMU on. This is bad
for two reasons: (a) the architecture specifies that those instructions
must be executed with the MMU off, and in fact with only the SF, HV, ME
and possibly RI bits set, and (b) this introduces a race, because as
soon as we set the flag, another thread can switch the MMU to a guest
context. If the race is lost, this thread will typically start looping
on relocation-on ISIs at 0xc...4400.
This fixes it by setting the MSR as required by the architecture before
setting the flag or executing the nap/sleep/rvwinkle instruction.
[ shreyas@linux.vnet.ibm.com: Edited to handle LE ]
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit cd32e2dcc9de6c27ecbbfc0e2079fb64b42bad5f upstream.
We have some code in udbg_uart_getc_poll() that tries to protect
against a NULL udbg_uart_in, but gets it all wrong.
Found with the LLVM static analyzer (scan-build).
Fixes: 309257484cc1 ("powerpc: Cleanup udbg_16550 and add support for LPC PIO-only UARTs")
Signed-off-by: Anton Blanchard <anton@samba.org>
[mpe: Add some newlines for readability while we're here]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit c9b08884c9c98929ec2d8abafd78e89062d01ee7 upstream.
The current code simply assumes Intel Arch PerfMon v2+ to have
the IA32_PERF_CAPABILITIES MSR; the SDM specifies that we should check
CPUID[1].ECX[15] (aka, FEATURE_PDCM) instead.
This was found by KVM which implements v2+ but didn't provide the
capabilities MSR. Change the code to DTRT; KVM will also implement the
MSR and return 0.
Cc: pbonzini@redhat.com
Reported-by: "Michael S. Tsirkin" <mst@redhat.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140203132903.GI8874@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 7ff4d90b4c24a03666f296c3d4878cd39001e81e upstream.
Today there are 3 instances of setgroups and due to an oversight their
permission checking has diverged. Add a common function so that
they may all share the same permission checking code.
This corrects the current oversight in the current permission checks
and adds a helper to avoid this in the future.
A user namespace security fix will update this new helper, shortly.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 3fb2f4237bb452eb4e98f6a5dbd5a445b4fed9d0 upstream.
It turns out that there's a lurking ABI issue. GCC, when
compiling this in a 32-bit program:
struct user_desc desc = {
.entry_number = idx,
.base_addr = base,
.limit = 0xfffff,
.seg_32bit = 1,
.contents = 0, /* Data, grow-up */
.read_exec_only = 0,
.limit_in_pages = 1,
.seg_not_present = 0,
.useable = 0,
};
will leave .lm uninitialized. This means that anything in the
kernel that reads user_desc.lm for 32-bit tasks is unreliable.
Revert the .lm check in set_thread_area(). The value never did
anything in the first place.
Fixes: 0e58af4e1d21 ("x86/tls: Disallow unusual TLS segments")
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 29fa6825463c97e5157284db80107d1bfac5d77b upstream.
paravirt_enabled has the following effects:
- Disables the F00F bug workaround warning. There is no F00F bug
workaround any more because Linux's standard IDT handling already
works around the F00F bug, but the warning still exists. This
is only cosmetic, and, in any event, there is no such thing as
KVM on a CPU with the F00F bug.
- Disables 32-bit APM BIOS detection. On a KVM paravirt system,
there should be no APM BIOS anyway.
- Disables tboot. I think that the tboot code should check the
CPUID hypervisor bit directly if it matters.
- paravirt_enabled disables espfix32. espfix32 should *not* be
disabled under KVM paravirt.
The last point is the purpose of this patch. It fixes a leak of the
high 16 bits of the kernel stack address on 32-bit KVM paravirt
guests. Fixes CVE-2014-8134.
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 0e58af4e1d2166e9e33375a0f121e4867010d4f8 upstream.
Users have no business installing custom code segments into the
GDT, and segments that are not present but are otherwise valid
are a historical source of interesting attacks.
For completeness, block attempts to set the L bit. (Prior to
this patch, the L bit would have been silently dropped.)
This is an ABI break. I've checked glibc, musl, and Wine, and
none of them look like they'll have any trouble.
Note to stable maintainers: this is a hardening patch that fixes
no known bugs. Given the possibility of ABI issues, this
probably shouldn't be backported quickly.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 41bdc78544b8a93a9c6814b8bbbfef966272abbe upstream.
Installing a 16-bit RW data segment into the GDT defeats espfix.
AFAICT this will not affect glibc, Wine, or dosemu at all.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 152d44a853e42952f6c8a504fb1f8eefd21fd5fd upstream.
I used some 64 bit instructions when adding the 32 bit getcpu VDSO
function. Fix it.
Fixes: 18ad51dd342a ("powerpc: Add VDSO version of getcpu")
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 338b522ca43cfd32d11a370f4203bcd089c6c877 upstream.
With -cpu host, KVM reports LBR and extra_regs support, if the host has
support.
When the guest perf driver tries to access LBR or extra_regs MSR,
it #GPs all MSR accesses,since KVM doesn't handle LBR and extra_regs support.
So check the related MSRs access right once at initialization time to avoid
the error access at runtime.
For reproducing the issue, please build the kernel with CONFIG_KVM_INTEL = y
(for host kernel).
And CONFIG_PARAVIRT = n and CONFIG_KVM_GUEST = n (for guest kernel).
Start the guest with -cpu host.
Run perf record with --branch-any or --branch-filter in guest to trigger LBR
Run perf stat offcore events (E.g. LLC-loads/LLC-load-misses ...) in guest to
trigger offcore_rsp #GP
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: Mark Davies <junk@eslaf.co.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1405365957-20202-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 360743814c4082515581aa23ab1d8e699e1fbe88 upstream.
Instead of the arch specific quirk which we are deprecating
and that drivers don't understand.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 3f4aa45ceea5789a4aade536acc27f2e0d3da5e1 upstream.
We cannot restart cacheflush safely if a process provides user-defined
signal handler and signal is pending. In this case -EINTR is returned
and it is expected that process re-invokes syscall. However, there are
a few problems with that:
* looks like nobody bothers checking return value from cacheflush
* but if it did, we don't provide the restart address for that, so the
process has to use the same range again
* ...and again, what might lead to looping forever
So, remove cacheflush restarting code and terminate cache flushing
as early as fatal signal is pending.
Reported-by: Chanho Min <chanho.min@lge.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 995ab5189d1d7264e79e665dfa032a19b3ac646e upstream.
Under extremely rare conditions, in an MPCore node consisting of at
least 3 CPUs, two CPUs trying to perform a STREX to data on the same
shared cache line can enter a livelock situation.
This patch enables the HW mechanism that overcomes the bug. This fixes
the incorrect setup of the STREX backoff delay bit due to a wrong
description in the specification.
Note that enabling the STREX backoff delay mechanism is done by
leaving the bit *cleared*, while the bit was currently being set by
the proc-v7.S code.
[Thomas: adapt to latest mainline, slightly reword the commit log, add
stable markers.]
Fixes: de4901933f6d ("arm: mm: Add support for PJ4B cpu and init routines")
Signed-off-by: Nadav Haklai <nadavh@marvell.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit ef59a20ba375aeb97b3150a118318884743452a8 upstream.
According to the manuals I have, XScale auxiliary register should be
reached with opc_2 = 1 instead of crn = 1. cpu_xscale_proc_init
correctly uses c1, c0, 1 arguments, but cpu_xscale_do_suspend and
cpu_xscale_do_resume use c1, c1, 0. Correct suspend/resume functions to
also use c1, c0, 1.
The issue was primarily noticed thanks to qemu reporing "unsupported
instruction" on the pxa suspend path. Confirmed in PXA210/250 and PXA255
XScale Core manuals and in PXA270 and PXA320 Developers Guides.
Harware tested by me on tosa (pxa255). Robert confirmed on pxa270 board.
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 3b8a3c01096925a824ed3272601082289d9c23a5 upstream.
On pseries system (LPAR) xmon failed to enter when running in LE mode,
system is hunging. Inititating xmon will lead to such an output on the
console:
SysRq : Entering xmon
cpu 0x15: Vector: 0 at [c0000003f39ffb10]
pc: c00000000007ed7c: sysrq_handle_xmon+0x5c/0x70
lr: c00000000007ed7c: sysrq_handle_xmon+0x5c/0x70
sp: c0000003f39ffc70
msr: 8000000000009033
current = 0xc0000003fafa7180
paca = 0xc000000007d75e80 softe: 0 irq_happened: 0x01
pid = 14617, comm = bash
Bad kernel stack pointer fafb4b0 at eca7cc4
cpu 0x15: Vector: 300 (Data Access) at [c000000007f07d40]
pc: 000000000eca7cc4
lr: 000000000eca7c44
sp: fafb4b0
msr: 8000000000001000
dar: 10000000
dsisr: 42000000
current = 0xc0000003fafa7180
paca = 0xc000000007d75e80 softe: 0 irq_happened: 0x01
pid = 14617, comm = bash
cpu 0x15: Exception 300 (Data Access) in xmon, returning to main loop
xmon: WARNING: bad recursive fault on cpu 0x15
The root cause is that xmon is calling RTAS to turn off the surveillance
when entering xmon, and RTAS is requiring big endian parameters.
This patch is byte swapping the RTAS arguments when running in LE mode.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 415072a041bf50dbd6d56934ffc0cbbe14c97be8 upstream.
Instead of the arch specific quirk which we are deprecating
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 82975bc6a6df743b9a01810fb32cb65d0ec5d60b upstream.
x86 call do_notify_resume on paranoid returns if TIF_UPROBE is set but
not on non-paranoid returns. I suspect that this is a mistake and that
the code only works because int3 is paranoid.
Setting _TIF_NOTIFY_RESUME in the uprobe code was probably a workaround
for the x86 bug. With that bug fixed, we can remove _TIF_NOTIFY_RESUME
from the uprobes code.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 45e2a9d4701d8c624d4a4bcdd1084eae31e92f58 upstream.
When setting up permissions on kernel memory at boot, the end of the
PMD that was split from bss remained executable. It should be NX like
the rest. This performs a PMD alignment instead of a PAGE alignment to
get the correct span of memory.
Before:
---[ High Kernel Mapping ]---
...
0xffffffff8202d000-0xffffffff82200000 1868K RW GLB NX pte
0xffffffff82200000-0xffffffff82c00000 10M RW PSE GLB NX pmd
0xffffffff82c00000-0xffffffff82df5000 2004K RW GLB NX pte
0xffffffff82df5000-0xffffffff82e00000 44K RW GLB x pte
0xffffffff82e00000-0xffffffffc0000000 978M pmd
After:
---[ High Kernel Mapping ]---
...
0xffffffff8202d000-0xffffffff82200000 1868K RW GLB NX pte
0xffffffff82200000-0xffffffff82e00000 12M RW PSE GLB NX pmd
0xffffffff82e00000-0xffffffffc0000000 978M pmd
[ tglx: Changed it to roundup(_brk_end, PMD_SIZE) and added a comment.
We really should unmap the reminder along with the holes
caused by init,initdata etc. but thats a different issue ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20141114194737.GA3091@www.outflux.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 2cd3949f702692cf4c5d05b463f19cd706a92dd3 upstream.
We have some very similarly named command-line options:
arch/x86/kernel/cpu/common.c:__setup("noxsave", x86_xsave_setup);
arch/x86/kernel/cpu/common.c:__setup("noxsaveopt", x86_xsaveopt_setup);
arch/x86/kernel/cpu/common.c:__setup("noxsaves", x86_xsaves_setup);
__setup() is designed to match options that take arguments, like
"foo=bar" where you would have:
__setup("foo", x86_foo_func...);
The problem is that "noxsave" actually _matches_ "noxsaves" in
the same way that "foo" matches "foo=bar". If you boot an old
kernel that does not know about "noxsaves" with "noxsaves" on the
command line, it will interpret the argument as "noxsave", which
is not what you want at all.
This makes the "noxsave" handler only return success when it finds
an *exact* match.
[ tglx: We really need to make __setup() more robust. ]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20141111220133.FE053984@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit b645af2d5905c4e32399005b867987919cbfc3ae upstream.
It's possible for iretq to userspace to fail. This can happen because
of a bad CS, SS, or RIP.
Historically, we've handled it by fixing up an exception from iretq to
land at bad_iret, which pretends that the failed iret frame was really
the hardware part of #GP(0) from userspace. To make this work, there's
an extra fixup to fudge the gs base into a usable state.
This is suboptimal because it loses the original exception. It's also
buggy because there's no guarantee that we were on the kernel stack to
begin with. For example, if the failing iret happened on return from an
NMI, then we'll end up executing general_protection on the NMI stack.
This is bad for several reasons, the most immediate of which is that
general_protection, as a non-paranoid idtentry, will try to deliver
signals and/or schedule from the wrong stack.
This patch throws out bad_iret entirely. As a replacement, it augments
the existing swapgs fudge into a full-blown iret fixup, mostly written
in C. It's should be clearer and more correct.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 6f442be2fb22be02cafa606f1769fa1e6f894441 upstream.
On a 32-bit kernel, this has no effect, since there are no IST stacks.
On a 64-bit kernel, #SS can only happen in user code, on a failed iret
to user space, a canonical violation on access via RSP or RBP, or a
genuine stack segment violation in 32-bit kernel code. The first two
cases don't need IST, and the latter two cases are unlikely fatal bugs,
and promoting them to double faults would be fine.
This fixes a bug in which the espfix64 code mishandles a stack segment
violation.
This saves 4k of memory per CPU and a tiny bit of code.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit af726f21ed8af2cdaa4e93098dc211521218ae65 upstream.
There's nothing special enough about the espfix64 double fault fixup to
justify writing it in assembly. Move it to C.
This also fixes a bug: if the double fault came from an IST stack, the
old asm code would return to a partially uninitialized stack frame.
Fixes: 3891a04aafd668686239349ea58f3314ea2af86b
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 26927f76499849e095714452b8a4e09350f6a3b9 upstream.
If SERIAL_8250 is compiled as a module, the platform specific setup
for Loongson will be a module too, and it will not work very well.
At least on Loongson 3 it will trigger a build failure,
since loongson_sysconf is not exported to modules.
Fix by making the platform specific serial code always built-in.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reported-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Markos Chandras <Markos.Chandras@imgtec.com>
Patchwork: https://patchwork.linux-mips.org/patch/8533/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit bbaf113a481b6ce32444c125807ad3618643ce57 upstream.
Fix incorrect cast that always results in wrong address for the new
frame on 64-bit kernels.
Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/8110/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit c1118b3602c2329671ad5ec8bdf8e374323d6343 upstream.
On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA.
In that case, KVM will fail to patch VMCALL instructions to VMMCALL
as required on AMD processors.
The failure mode is currently a divide-by-zero exception, which obviously
is a KVM bug that has to be fixed. However, picking the right instruction
between VMCALL and VMMCALL will be faster and will help if you cannot upgrade
the hypervisor.
Reported-by: Chris Webb <chris@arachsys.com>
Tested-by: Chris Webb <chris@arachsys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit 5a2b59d3993e8ca4f7788a48a23e5cb303f26954 ]
We are reading the memory location, so we have to have a memory
constraint in there purely for the sake of showing the data flow
to the compiler.
Reported-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit a2b9e6c1a35afcc0973acb72e591c714e78885ff upstream.
Commit fc3a9157d314 ("KVM: X86: Don't report L2 emulation failures to
user-space") disabled the reporting of L2 (nested guest) emulation failures to
userspace due to race-condition between a vmexit and the instruction emulator.
The same rational applies also to userspace applications that are permitted by
the guest OS to access MMIO area or perform PIO.
This patch extends the current behavior - of injecting a #UD instead of
reporting it to userspace - also for guest userspace code.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 888be25402021a425da3e85e2d5a954d7509286e upstream.
If we are running BE8, the data and instruction endianness do not
match, so use <asm/opcodes.h> to correctly translate memory accesses
into ARM instructions.
Acked-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
[taras.kondratiuk@linaro.org: fixed Thumb instruction fetch order]
Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
[wangnan: backport to 3.10 and 3.14:
- adjust context
- backport all changes on arch/arm/kernel/probes.c to
arch/arm/kernel/kprobes-common.c since we don't have
commit c18377c303787ded44b7decd7dee694db0f205e9.
- After the above adjustments, becomes same to Taras Kondratiuk's
original patch:
http://lists.linaro.org/pipermail/linaro-kernel/2014-January/010346.html
]
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 2fe749f50b0bec07650ef135b29b1f55bf543869 upstream.
Switch over the msgctl, shmat, shmctl and semtimedop syscalls to use the compat
layer. The problem was found with the debian procenv package, which called
shmctl(0, SHM_INFO, &info);
in which the shmctl syscall then overwrote parts of the surrounding areas on
the stack on which the info variable was stored and thus lead to a segfault
later on.
Additionally fix the definition of struct shminfo64 to use unsigned longs like
the other architectures. This has no impact on userspace since we only have a
32bit userspace up to now.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 97fc15436b36ee3956efad83e22a557991f7d19d upstream.
ARM64 currently doesn't fix up faults on the single-byte (strb) case of
__clear_user... which means that we can cause a nasty kernel panic as an
ordinary user with any multiple PAGE_SIZE+1 read from /dev/zero.
i.e.: dd if=/dev/zero of=foo ibs=1 count=1 (or ibs=65537, etc.)
This is a pretty obscure bug in the general case since we'll only
__do_kernel_fault (since there's no extable entry for pc) if the
mmap_sem is contended. However, with CONFIG_DEBUG_VM enabled, we'll
always fault.
if (!down_read_trylock(&mm->mmap_sem)) {
if (!user_mode(regs) && !search_exception_tables(regs->pc))
goto no_context;
retry:
down_read(&mm->mmap_sem);
} else {
/*
* The above down_read_trylock() might have succeeded in
* which
* case, we'll have missed the might_sleep() from
* down_read().
*/
might_sleep();
if (!user_mode(regs) && !search_exception_tables(regs->pc))
goto no_context;
}
Fix that by adding an extable entry for the strb instruction, since it
touches user memory, similar to the other stores in __clear_user.
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Reported-by: Miloš Prchlík <mprchlik@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 238962ac71910d6c20162ea5230685fead1836a4 upstream.
To speed up decompression, the decompressor sets up a flat, cacheable
mapping of memory. However, when there is insufficient space to hold
the page tables for this mapping, we don't bother to enable the caches
and subsequently skip all the cache maintenance hooks.
Skipping the cache maintenance before jumping to the relocated code
allows the processor to predict the branch and populate the I-cache
with stale data before the relocation loop has completed (since a
bootloader may have SCTLR.I set, which permits normal, cacheable
instruction fetches regardless of SCTLR.M).
This patch moves the cache maintenance check into the maintenance
routines themselves, allowing the v6/v7 versions to invalidate the
I-cache regardless of the MMU state.
Reported-by: Marc Carino <marc.ceeeee@gmail.com>
Tested-by: Julien Grall <julien.grall@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 08b964ff3c51b10aaf2e6ba639f40054c09f0f7a upstream.
The kuser helpers page is not set up on non-MMU systems, so it does
not make sense to allow CONFIG_KUSER_HELPERS to be enabled when
CONFIG_MMU=n. Allowing it to be set on !MMU results in an oops in
set_tls (used in execve and the arm_syscall trap handler):
Unhandled exception: IPSR = 00000005 LR = fffffff1
CPU: 0 PID: 1 Comm: swapper Not tainted 3.18.0-rc1-00041-ga30465a #216
task: 8b838000 ti: 8b82a000 task.ti: 8b82a000
PC is at flush_thread+0x32/0x40
LR is at flush_thread+0x21/0x40
pc : [<8f00157a>] lr : [<8f001569>] psr: 4100000b
sp : 8b82be20 ip : 00000000 fp : 8b83c000
r10: 00000001 r9 : 88018c84 r8 : 8bb85000
r7 : 8b838000 r6 : 00000000 r5 : 8bb77400 r4 : 8b82a000
r3 : ffff0ff0 r2 : 8b82a000 r1 : 00000000 r0 : 88020354
xPSR: 4100000b
CPU: 0 PID: 1 Comm: swapper Not tainted 3.18.0-rc1-00041-ga30465a #216
[<8f002bc1>] (unwind_backtrace) from [<8f002033>] (show_stack+0xb/0xc)
[<8f002033>] (show_stack) from [<8f00265b>] (__invalid_entry+0x4b/0x4c)
As best I can tell this issue existed for the set_tls ARM syscall
before commit fbfb872f5f41 "ARM: 8148/1: flush TLS and thumbee
register state during exec" consolidated the TLS manipulation code
into the set_tls helper function, but now that we're using it to flush
register state during execve, !MMU users encounter the oops at the
first exec.
Prevent CONFIG_MMU=n configurations from enabling
CONFIG_KUSER_HELPERS.
Fixes: fbfb872f5f41 (ARM: 8148/1: flush TLS and thumbee register state during exec)
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Reported-by: Stefan Agner <stefan@agner.ch>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 2651cc6974d47fc43bef1cd8cd26966e4f5ba306 upstream.
Userspace actually passes single parameter (path name) to the umount
syscall, so new umount just fails. Fix it by requesting old umount
syscall implementation and re-wiring umount to it.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 81f49a8fd7088cfcb588d182eeede862c0e3303e upstream.
is_compat_task() is the wrong check for audit arch; the check should
be is_ia32_task(): x32 syscalls should be AUDIT_ARCH_X86_64, not
AUDIT_ARCH_I386.
CONFIG_AUDITSYSCALL is currently incompatible with x32, so this has
no visible effect.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/a0138ed8c709882aec06e4acc30bfa9b623b8717.1409954077.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit 1a17fdc4f4ed06b63fac1937470378a5441a663a ]
Atomicity between xchg and cmpxchg cannot be guaranteed when xchg is
implemented with a swap and cmpxchg is implemented with locks.
Without this, e.g. mcs_spin_lock and mcs_spin_unlock are broken.
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit ab5c780913bca0a5763ca05dd5c2cb5cb08ccb26 ]
Otherwise rcu_irq_{enter,exit}() do not happen and we get dumps like:
====================
[ 188.275021] ===============================
[ 188.309351] [ INFO: suspicious RCU usage. ]
[ 188.343737] 3.18.0-rc3-00068-g20f3963-dirty #54 Not tainted
[ 188.394786] -------------------------------
[ 188.429170] include/linux/rcupdate.h:883 rcu_read_lock() used
illegally while idle!
[ 188.505235]
other info that might help us debug this:
[ 188.554230]
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
[ 188.637587] RCU used illegally from extended quiescent state!
[ 188.690684] 3 locks held by swapper/7/0:
[ 188.721932] #0: (&x->wait#11){......}, at: [<0000000000495de8>] complete+0x8/0x60
[ 188.797994] #1: (&p->pi_lock){-.-.-.}, at: [<000000000048510c>] try_to_wake_up+0xc/0x400
[ 188.881343] #2: (rcu_read_lock){......}, at: [<000000000048a910>] select_task_rq_fair+0x90/0xb40
[ 188.973043]stack backtrace:
[ 188.993879] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.18.0-rc3-00068-g20f3963-dirty #54
[ 189.076187] Call Trace:
[ 189.089719] [0000000000499360] lockdep_rcu_suspicious+0xe0/0x100
[ 189.147035] [000000000048a99c] select_task_rq_fair+0x11c/0xb40
[ 189.202253] [00000000004852d8] try_to_wake_up+0x1d8/0x400
[ 189.252258] [000000000048554c] default_wake_function+0xc/0x20
[ 189.306435] [0000000000495554] __wake_up_common+0x34/0x80
[ 189.356448] [00000000004955b4] __wake_up_locked+0x14/0x40
[ 189.406456] [0000000000495e08] complete+0x28/0x60
[ 189.448142] [0000000000636e28] blk_end_sync_rq+0x8/0x20
[ 189.496057] [0000000000639898] __blk_mq_end_request+0x18/0x60
[ 189.550249] [00000000006ee014] scsi_end_request+0x94/0x180
[ 189.601286] [00000000006ee334] scsi_io_completion+0x1d4/0x600
[ 189.655463] [00000000006e51c4] scsi_finish_command+0xc4/0xe0
[ 189.708598] [00000000006ed958] scsi_softirq_done+0x118/0x140
[ 189.761735] [00000000006398ec] __blk_mq_complete_request_remote+0xc/0x20
[ 189.827383] [00000000004c75d0] generic_smp_call_function_single_interrupt+0x150/0x1c0
[ 189.906581] [000000000043e514] smp_call_function_single_client+0x14/0x40
====================
Based almost entirely upon a patch by Paul E. McKenney.
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit 7da89a2a3776442a57e918ca0b8678d1b16a7072 ]
Meelis Roos reports crashes during bootup on a V480 that look like
this:
====================
[ 61.300577] PCI: Scanning PBM /pci@9,600000
[ 61.304867] schizo f009b070: PCI host bridge to bus 0003:00
[ 61.310385] pci_bus 0003:00: root bus resource [io 0x7ffe9000000-0x7ffe9ffffff] (bus address [0x0000-0xffffff])
[ 61.320515] pci_bus 0003:00: root bus resource [mem 0x7fb00000000-0x7fbffffffff] (bus address [0x00000000-0xffffffff])
[ 61.331173] pci_bus 0003:00: root bus resource [bus 00]
[ 61.385344] Unable to handle kernel NULL pointer dereference
[ 61.390970] tsk->{mm,active_mm}->context = 0000000000000000
[ 61.396515] tsk->{mm,active_mm}->pgd = fff000b000002000
[ 61.401716] \|/ ____ \|/
[ 61.401716] "@'/ .. \`@"
[ 61.401716] /_| \__/ |_\
[ 61.401716] \__U_/
[ 61.416362] swapper/0(0): Oops [#1]
[ 61.419837] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc1-00422-g2cc9188-dirty #24
[ 61.427975] task: fff000b0fd8e9c40 ti: fff000b0fd928000 task.ti: fff000b0fd928000
[ 61.435426] TSTATE: 0000004480e01602 TPC: 00000000004455e4 TNPC: 00000000004455e8 Y: 00000000 Not tainted
[ 61.445230] TPC: <schizo_pcierr_intr+0x104/0x560>
[ 61.449897] g0: 0000000000000000 g1: 0000000000000000 g2: 0000000000a10f78 g3: 000000000000000a
[ 61.458563] g4: fff000b0fd8e9c40 g5: fff000b0fdd82000 g6: fff000b0fd928000 g7: 000000000000000a
[ 61.467229] o0: 000000000000003d o1: 0000000000000000 o2: 0000000000000006 o3: fff000b0ffa5fc7e
[ 61.475894] o4: 0000000000060000 o5: c000000000000000 sp: fff000b0ffa5f3c1 ret_pc: 00000000004455cc
[ 61.484909] RPC: <schizo_pcierr_intr+0xec/0x560>
[ 61.489500] l0: fff000b0fd8e9c40 l1: 0000000000a20800 l2: 0000000000000000 l3: 000000000119a430
[ 61.498164] l4: 0000000001742400 l5: 00000000011cfbe0 l6: 00000000011319c0 l7: fff000b0fd8ea348
[ 61.506830] i0: 0000000000000000 i1: fff000b0fdb34000 i2: 0000000320000000 i3: 0000000000000000
[ 61.515497] i4: 00060002010b003f i5: 0000040004e02000 i6: fff000b0ffa5f481 i7: 00000000004a9920
[ 61.524175] I7: <handle_irq_event_percpu+0x40/0x140>
[ 61.529099] Call Trace:
[ 61.531531] [00000000004a9920] handle_irq_event_percpu+0x40/0x140
[ 61.537681] [00000000004a9a58] handle_irq_event+0x38/0x80
[ 61.543145] [00000000004ac77c] handle_fasteoi_irq+0xbc/0x200
[ 61.548860] [00000000004a9084] generic_handle_irq+0x24/0x40
[ 61.554500] [000000000042be0c] handler_irq+0xac/0x100
====================
The problem is that pbm->pci_bus->self is NULL.
This code is trying to go through the standard PCI config space
interfaces to read the PCI controller's PCI_STATUS register.
This doesn't work, because we more often than not do not enumerate
the PCI controller as a bonafide PCI device during the OF device
node scan. Therefore bus->self remains NULL.
Existing common code for PSYCHO and PSYCHO-like PCI controllers
handles this properly, by doing the config space access directly.
Do the same here, pbm->pci_ops->{read,write}().
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit d0aedcd4f14a22e23b313f42b7e6e6ebfc0fbc31 ]
vio_dring_avail() will allow use of every dring entry, but when the last
entry is allocated then dr->prod == dr->cons which is indistinguishable from
the ring empty condition. This causes the next allocation to reuse an entry.
When this happens in sunvdc, the server side vds driver begins nack'ing the
messages and ends up resetting the ldc channel. This problem does not effect
sunvnet since it checks for < 2.
The fix here is to just never allocate the very last dring slot so that full
and empty are not the same condition. The request start path was changed to
check for the ring being full a bit earlier, and to stop the blk_queue if
there is no space left. The blk_queue will be restarted once the ring is
only half full again. The number of ring entries was increased to 512 which
matches the sunvnet and Solaris vdc drivers, and greatly reduces the
frequency of hitting the ring full condition and the associated blk_queue
stop/starting. The checks in sunvent were adjusted to account for
vio_dring_avail() returning 1 less.
Orabug: 19441666
OraBZ: 14983
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
[ Upstream commit 9bce21828d54a95143f1b74619705c2dd8e88b92 ]
Interpret the media type from v1.1 protocol to support CDROM/DVD.
For v1.0 protocol, a disk's size continues to be calculated from the
geometry returned by the vdisk server. The geometry returned by the server
can be less than the actual number of sectors available in the backing
image/device due to the rounding in the division used to compute the
geometry in the vdisk server.
In v1.1 protocol a disk's actual size in sectors is returned during the
handshake. Use this size when v1.1 protocol is negotiated. Since this size
will always be larger than the former geometry computed size, disks created
under v1.0 will be forwards compatible to v1.1, but not vice versa.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 653bc77af60911ead1f423e588f54fc2547c4957 upstream.
Rusty noticed a Really Bad Bug (tm) in my NT fix. The entry code
reads out of bounds, causing the NT fix to be unreliable. But, and
this is much, much worse, if your stack is somehow just below the
top of the direct map (or a hole), you read out of bounds and crash.
Excerpt from the crash:
[ 1.129513] RSP: 0018:ffff88001da4bf88 EFLAGS: 00010296
2b:* f7 84 24 90 00 00 00 testl $0x4000,0x90(%rsp)
That read is deterministically above the top of the stack. I
thought I even single-stepped through this code when I wrote it to
check the offset, but I clearly screwed it up.
Fixes: 8c7aa698baca ("x86_64, entry: Filter RFLAGS.NT on entry from userspace")
Reported-by: Rusty Russell <rusty@ozlabs.org>
Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
Requested-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|
|
commit 10ccaf178b2b961d8bca252d647ed7ed8aae2a20 upstream.
In powerpc pseries platform dlpar operations, use device_online() and
device_offline() instead of cpu_up() and cpu_down().
Calling cpu_up/down() directly does not update the cpu device offline
field, which is used to online/offline a cpu from sysfs. Calling
device_online/offline() instead keeps the sysfs cpu online value
correct. The hotplug lock, which is required to be held when calling
device_online/offline(), is already held when dlpar_online/offline_cpu()
are called, since they are called only from cpu_probe|release_store().
This patch fixes errors on phyp (PowerVM) systems that have cpu(s)
added/removed using dlpar operations; without this patch, the
/sys/devices/system/cpu/cpuN/online nodes do not correctly show the
online state of added/removed cpus.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Fixes: 0902a9044fa5 ("Driver core: Use generic offline/online for CPU offline/online")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
|