Age | Commit message (Collapse) | Author |
|
|
|
commit f05819df10d7b09f6d1eb6f8534a8f68e5a4fe61 upstream.
The following sequence of commands:
i=`keyctl add user a a @s`
keyctl request2 keyring foo bar @t
keyctl unlink $i @s
tries to invoke an upcall to instantiate a keyring if one doesn't already
exist by that name within the user's keyring set. However, if the upcall
fails, the code sets keyring->type_data.reject_error to -ENOKEY or some
other error code. When the key is garbage collected, the key destroy
function is called unconditionally and keyring_destroy() uses list_empty()
on keyring->type_data.link - which is in a union with reject_error.
Subsequently, the kernel tries to unlink the keyring from the keyring names
list - which oopses like this:
BUG: unable to handle kernel paging request at 00000000ffffff8a
IP: [<ffffffff8126e051>] keyring_destroy+0x3d/0x88
...
Workqueue: events key_garbage_collector
...
RIP: 0010:[<ffffffff8126e051>] keyring_destroy+0x3d/0x88
RSP: 0018:ffff88003e2f3d30 EFLAGS: 00010203
RAX: 00000000ffffff82 RBX: ffff88003bf1a900 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000003bfc6901 RDI: ffffffff81a73a40
RBP: ffff88003e2f3d38 R08: 0000000000000152 R09: 0000000000000000
R10: ffff88003e2f3c18 R11: 000000000000865b R12: ffff88003bf1a900
R13: 0000000000000000 R14: ffff88003bf1a908 R15: ffff88003e2f4000
...
CR2: 00000000ffffff8a CR3: 000000003e3ec000 CR4: 00000000000006f0
...
Call Trace:
[<ffffffff8126c756>] key_gc_unused_keys.constprop.1+0x5d/0x10f
[<ffffffff8126ca71>] key_garbage_collector+0x1fa/0x351
[<ffffffff8105ec9b>] process_one_work+0x28e/0x547
[<ffffffff8105fd17>] worker_thread+0x26e/0x361
[<ffffffff8105faa9>] ? rescuer_thread+0x2a8/0x2a8
[<ffffffff810648ad>] kthread+0xf3/0xfb
[<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
[<ffffffff815f2ccf>] ret_from_fork+0x3f/0x70
[<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
Note the value in RAX. This is a 32-bit representation of -ENOKEY.
The solution is to only call ->destroy() if the key was successfully
instantiated.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
[carnil: Backported for 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 94c4554ba07adbdde396748ee7ae01e86cf2d8d7 upstream.
There appears to be a race between:
(1) key_gc_unused_keys() which frees key->security and then calls
keyring_destroy() to unlink the name from the name list
(2) find_keyring_by_name() which calls key_permission(), thus accessing
key->security, on a key before checking to see whether the key usage is 0
(ie. the key is dead and might be cleaned up).
Fix this by calling ->destroy() before cleaning up the core key data -
including key->security.
Reported-by: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
[carnil: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 54a20552e1eae07aa240fa370a0293e006b5faed upstream.
It was found that a guest can DoS a host by triggering an infinite
stream of "alignment check" (#AC) exceptions. This causes the
microcode to enter an infinite loop where the core never receives
another interrupt. The host kernel panics pretty quickly due to the
effects (CVE-2015-5307).
Signed-off-by: Eric Northup <digitaleric@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.2:
- Add definition of AC_VECTOR
- Adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
mount
commit a41cbe86df3afbc82311a1640e20858c0cd7e065 upstream.
A test case is as the description says:
open(foobar, O_WRONLY);
sleep() --> reboot the server
close(foobar)
The bug is because in nfs4state.c in nfs4_reclaim_open_state() a few
line before going to restart, there is
clear_bit(NFS4CLNT_RECLAIM_NOGRACE, &state->flags).
NFS4CLNT_RECLAIM_NOGRACE is a flag for the client states not open
owner states. Value of NFS4CLNT_RECLAIM_NOGRACE is 4 which is the
value of NFS_O_WRONLY_STATE in nfs4_state->flags. So clearing it wipes
out state and when we go to close it, “call_close” doesn’t get set as
state flag is not set and CLOSE doesn’t go on the wire.
Signed-off-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 436c2a5036b6ffe813310df2cf327d3b69be0734 ]
commit 3cc81d85ee01 ("asix: Don't reset PHY on if_up for ASIX 88772")
causes the ethernet on Arndale to no longer function. This appears to
be because the Arndale ethernet requires a full reset before it will
function correctly, however simply reverting the above patch causes
problems with ethtool settings getting reset.
It seems the problem is that the ethernet is not properly reset during
bind, and indeed the code in ax88772_bind that resets the device is a
very small subset of the actual ax88772_reset function. This patch uses
ax88772_reset in place of the existing reset code in ax88772_bind which
removes some code duplication and fixes the ethernet on Arndale.
It is still possible that the original patch causes some issues with
suspend and resume but that seems like a separate issue and I haven't
had a chance to test that yet.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Tested-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 3cc81d85ee01e5a0b7ea2f4190e2ed1165f53c31 ]
I've noticed every time the interface is set to 'up,', the kernel
reports that the link speed is set to 100 Mbps/Full Duplex, even
when ethtool is used to set autonegotiation to 'off', half
duplex, 10 Mbps.
It can be tested by:
ifconfig eth0 down
ethtool -s eth0 autoneg off speed 10 duplex half
ifconfig eth0 up
Then checking 'dmesg' for the link speed.
Signed-off-by: Michel Stam <m.stam@fugro.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 077cb37fcf6f00a45f375161200b5ee0cd4e937b ]
It seems that kernel memory can leak into userspace by a
kmalloc, ethtool_get_strings, then copy_to_user sequence.
Avoid this by using kcalloc to zero fill the copied buffer.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 31b33dfb0a144469dd805514c9e63f4993729a48 ]
Earlier patch 6ae459bda tried to detect void ckecksum partial
skb by comparing pull length to checksum offset. But it does
not work for all cases since checksum-offset depends on
updates to skb->data.
Following patch fixes it by validating checksum start offset
after skb-data pointer is updated. Negative value of checksum
offset start means there is no need to checksum.
Fixes: 6ae459bda ("skbuff: Fix skb checksum flag on skb pull")
Reported-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 6ae459bdaaeebc632b16e54dcbabb490c6931d61 ]
VXLAN device can receive skb with checksum partial. But the checksum
offset could be in outer header which is pulled on receive. This results
in negative checksum offset for the skb. Such skb can cause the assert
failure in skb_checksum_help(). Following patch fixes the bug by setting
checksum-none while pulling outer header.
Following is the kernel panic msg from old kernel hitting the bug.
------------[ cut here ]------------
kernel BUG at net/core/dev.c:1906!
RIP: 0010:[<ffffffff81518034>] skb_checksum_help+0x144/0x150
Call Trace:
<IRQ>
[<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch]
[<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch]
[<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch]
[<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch]
[<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch]
[<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch]
[<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch]
[<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0
[<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0
[<ffffffff8157ba7a>] udp_rcv+0x1a/0x20
[<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280
[<ffffffff81550128>] ip_local_deliver+0x88/0x90
[<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370
[<ffffffff81550365>] ip_rcv+0x235/0x300
[<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620
[<ffffffff8151c360>] netif_receive_skb+0x80/0x90
[<ffffffff81459935>] virtnet_poll+0x555/0x6f0
[<ffffffff8151cd04>] net_rx_action+0x134/0x290
[<ffffffff810683d8>] __do_softirq+0xa8/0x210
[<ffffffff8162fe6c>] call_softirq+0x1c/0x30
[<ffffffff810161a5>] do_softirq+0x65/0xa0
[<ffffffff810687be>] irq_exit+0x8e/0xb0
[<ffffffff81630733>] do_IRQ+0x63/0xe0
[<ffffffff81625f2e>] common_interrupt+0x6e/0x6e
Reported-by: Anupam Chanda <achanda@vmware.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
Without this length argument, we can read past the end of the iovec in
memcpy_toiovec because we have no way of knowing the total length of the
iovec's buffers.
This is needed for stable kernels where 89c22d8c3b27 ("net: Fix skb
csum races when peeking") has been backported but that don't have the
ioviter conversion, which is almost all the stable trees <= 3.18.
This also fixes a kernel crash for NFS servers when the client uses
-onfsvers=3,proto=udp to mount the export.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
[bwh: Backported to 3.2: adjust context in include/linux/skbuff.h]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 80e0b6e8a001361316a2d62b748fe677ec46b860 upstream.
We accidentally declared pid_alive without any extern/inline connotation.
Some platforms were fine with this, some like ia64 and mips were very angry.
If the function is inline, the prototype should be inline!
on ia64:
include/linux/sched.h:1718: warning: 'pid_alive' declared inline after
being called
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Neal Gompa <ngompa13@gmail.com>
|
|
commit 2280521719e81919283b82902ac24058f87dfc1b upstream.
When pci_pool_alloc fails in mvs_task_prep then task->lldd_task stays
NULL but it's later used in mvs_abort_task as slot which is passed
to mvs_slot_task_free causing NULL pointer dereference.
Just return from mvs_slot_task_free when passed with NULL slot.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=101891
Signed-off-by: Dāvis Mosāns <davispuh@gmail.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit c340702ca26a628832fade4f133d8160a55c29cc upstream.
When a write fails and a bad-block-list is present, we can
update the bad-block-list instead of writing the data. If
this succeeds then it is OK clear the relevant bitmap-bit as
no further 'sync' of the block is needed.
However if writing the bad-block-list fails then we need to
treat the write as failed and particularly must not clear
the bitmap bit. Otherwise the device can be re-added (after
any hardware connection issues are resolved) and because the
relevant bit in the bitmap is clear, that block will not be
resynced. This leads to data corruption.
We already delay the final bio_endio() on the write until
the bad-block-list is written so that when the write
returns: either that data is safe, the bad-block record is
safe, or the fact that the device is faulty is safe.
However we *don't* delay the clearing of the bitmap, so the
bitmap bit can be recorded as cleared before we know if the
bad-block-list was written safely.
So: delay that until the write really is safe.
i.e. move the call to close_write() until just before
calling bio_endio(), and recheck the 'is array degraded'
status before making that call.
This bug goes back to v3.1 when bad-block-lists were
introduced, though it only affects arrays created with
mdadm-3.3 or later as only those have bad-block lists.
Backports will require at least
Commit: 95af587e95aa ("md/raid10: ensure device failure recorded before write request returns.")
as well. I'll send that to 'stable' separately.
Note that of the two tests of R10BIO_WriteError that this
patch adds, the first is certain to fail and the second is
certain to succeed. However doing it this way makes the
patch more obviously correct. I will tidy the code up in a
future merge window.
Reported-by: Nate Dailey <nate.dailey@stratus.com>
Fixes: bd870a16c594 ("md/raid10: Handle write errors by updating badblock log.")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 95af587e95aacb9cfda4a9641069a5244a540dc8 upstream.
When a write to one of the legs of a RAID10 fails, the failure is
recorded in the metadata of the other legs so that after a restart
the data on the failed drive wont be trusted even if that drive seems
to be working again (maybe a cable was unplugged).
Currently there is no interlock between the write request completing
and the metadata update. So it is possible that the write will
complete, the app will confirm success in some way, and then the
machine will crash before the metadata update completes.
This is an extremely small hole for a racy to fit in, but it is
theoretically possible and so should be closed.
So:
- set MD_CHANGE_PENDING when requesting a metadata update for a
failed device, so we can know with certainty when it completes
- queue requests that experienced an error on a new queue which
is only processed after the metadata update completes
- call raid_end_bio_io() on bios in that queue when the time comes.
Signed-off-by: NeilBrown <neilb@suse.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit bd8688a199b864944bf62eebed0ca13b46249453 upstream.
When a write fails and a bad-block-list is present, we can
update the bad-block-list instead of writing the data. If
this succeeds then it is OK clear the relevant bitmap-bit as
no further 'sync' of the block is needed.
However if writing the bad-block-list fails then we need to
treat the write as failed and particularly must not clear
the bitmap bit. Otherwise the device can be re-added (after
any hardware connection issues are resolved) and because the
relevant bit in the bitmap is clear, that block will not be
resynced. This leads to data corruption.
We already delay the final bio_endio() on the write until
the bad-block-list is written so that when the write
returns: either that data is safe, the bad-block record is
safe, or the fact that the device is faulty is safe.
However we *don't* delay the clearing of the bitmap, so the
bitmap bit can be recorded as cleared before we know if the
bad-block-list was written safely.
So: delay that until the write really is safe.
i.e. move the call to close_write() until just before
calling bio_endio(), and recheck the 'is array degraded'
status before making that call.
This bug goes back to v3.1 when bad-block-lists were
introduced, though it only affects arrays created with
mdadm-3.3 or later as only those have bad-block lists.
Backports will require at least
Commit: 55ce74d4bfe1 ("md/raid1: ensure device failure recorded before write request returns.")
as well. I'll send that to 'stable' separately.
Note that of the two tests of R1BIO_WriteError that this
patch adds, the first is certain to fail and the second is
certain to succeed. However doing it this way makes the
patch more obviously correct. I will tidy the code up in a
future merge window.
Reported-and-tested-by: Nate Dailey <nate.dailey@stratus.com>
Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
Fixes: cd5ff9a16f08 ("md/raid1: Handle write errors by updating badblock log.")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 55ce74d4bfe1b9444436264c637f39a152d1e5ac upstream.
When a write to one of the legs of a RAID1 fails, the failure is
recorded in the metadata of the other leg(s) so that after a restart
the data on the failed drive wont be trusted even if that drive seems
to be working again (maybe a cable was unplugged).
Similarly when we record a bad-block in response to a write failure,
we must not let the write complete until the bad-block update is safe.
Currently there is no interlock between the write request completing
and the metadata update. So it is possible that the write will
complete, the app will confirm success in some way, and then the
machine will crash before the metadata update completes.
This is an extremely small hole for a racy to fit in, but it is
theoretically possible and so should be closed.
So:
- set MD_CHANGE_PENDING when requesting a metadata update for a
failed device, so we can know with certainty when it completes
- queue requests that experienced an error on a new queue which
is only processed after the metadata update completes
- call raid_end_bio_io() on bios in that queue when the time comes.
Signed-off-by: NeilBrown <neilb@suse.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 4dcb8b57df3593dcb20481d9d6cf79d1dc1534be upstream.
btree_split_beneath()'s error path had an outstanding FIXME that speaks
directly to the potential for _not_ cleaning up a previously allocated
bufio-backed block.
Fix this by releasing the previously allocated bufio block using
unlock_block().
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 2871c69e025e8bc507651d5a9cf81a8a7da9d24b upstream.
Commit 4c7e309340ff ("dm btree remove: fix bug in redistribute3") wasn't
a complete fix for redistribute3().
The redistribute3 function takes 3 btree nodes and shares out the entries
evenly between them. If the three nodes in total contained
(MAX_ENTRIES * 3) - 1 entries between them then this was erroneously getting
rebalanced as (MAX_ENTRIES - 1) on the left and right, and (MAX_ENTRIES + 1) in
the center.
Fix this issue by being more careful about calculating the target number
of entries for the left and right nodes.
Unit tested in userspace using this program:
https://github.com/jthornber/redistribute3-test/blob/master/redistribute3_t.c
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 1acea4f6ce1b1c0941438aca75dd2e5c6b09db60 upstream.
We can't rely on PPPOX_ZOMBIE to decide whether to clear po->pppoe_dev.
PPPOX_ZOMBIE can be set by pppoe_disc_rcv() even when po->pppoe_dev is
NULL. So we have no guarantee that (sk->sk_state & PPPOX_ZOMBIE) implies
(po->pppoe_dev != NULL).
Since we're releasing a PPPoE socket, we want to release the pppoe_dev
if it exists and reset sk_state to PPPOX_DEAD, no matter the previous
value of sk_state. So we can just check for po->pppoe_dev and avoid any
assumption on sk->sk_state.
Fixes: 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 296291cdd1629c308114504b850dc343eabc2782 upstream.
Currently a simple program below issues a sendfile(2) system call which
takes about 62 days to complete in my test KVM instance.
int fd;
off_t off = 0;
fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
ftruncate(fd, 2);
lseek(fd, 0, SEEK_END);
sendfile(fd, fd, &off, 0xfffffff);
Now you should not ask kernel to do a stupid stuff like copying 256MB in
2-byte chunks and call fsync(2) after each chunk but if you do, sysadmin
should have a way to stop you.
We actually do have a check for fatal_signal_pending() in
generic_perform_write() which triggers in this path however because we
always succeed in writing something before the check is done, we return
value > 0 from generic_perform_write() and thus the information about
signal gets lost.
Fix the problem by doing the signal check before writing anything. That
way generic_perform_write() returns -EINTR, the error gets propagated up
and the sendfile loop terminates early.
Signed-off-by: Jan Kara <jack@suse.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 8832317f662c06f5c06e638f57bfe89a71c9b266 upstream.
Currently we do not validate rtas.entry before calling enter_rtas(). This
leads to a kernel oops when user space calls rtas system call on a powernv
platform (see below). This patch adds code to validate rtas.entry before
making enter_rtas() call.
Oops: Exception in kernel mode, sig: 4 [#1]
SMP NR_CPUS=1024 NUMA PowerNV
task: c000000004294b80 ti: c0000007e1a78000 task.ti: c0000007e1a78000
NIP: 0000000000000000 LR: 0000000000009c14 CTR: c000000000423140
REGS: c0000007e1a7b920 TRAP: 0e40 Not tainted (3.18.17-340.el7_1.pkvm3_1_0.2400.1.ppc64le)
MSR: 1000000000081000 <HV,ME> CR: 00000000 XER: 00000000
CFAR: c000000000009c0c SOFTE: 0
NIP [0000000000000000] (null)
LR [0000000000009c14] 0x9c14
Call Trace:
[c0000007e1a7bba0] [c00000000041a7f4] avc_has_perm_noaudit+0x54/0x110 (unreliable)
[c0000007e1a7bd80] [c00000000002ddc0] ppc_rtas+0x150/0x2d0
[c0000007e1a7be30] [c000000000009358] syscall_exit+0x0/0x98
Fixes: 55190f88789a ("powerpc: Add skeleton PowerNV platform")
Reported-by: NAGESWARA R. SASTRY <nasastry@in.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[mpe: Reword change log, trim oops, and add stable + fixes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 2a6c521bb41ce862e43db46f52e7681d33e8d771 upstream.
On nv50+, we restrict the valid domains to just the one where the buffer
was originally created. However after the buffer is evicted to system
memory, we might move it back to a different domain that was not
originally valid. When sharing the buffer and retrieving its GEM_INFO
data, we still want the domain that will be valid for this buffer in a
pushbuf, not the one where it currently happens to be.
This resolves fdo#92504 and several others. These are due to suspend
evicting all buffers, making it more likely that they temporarily end up
in the wrong place.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92504
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 0ca81a2840f77855bbad1b9f172c545c4dc9e6a4 upstream.
ib_send_cm_sidr_rep could sometimes erase the node from the sidr
(depending on errors in the process). Since ib_send_cm_sidr_rep is
called both from cm_sidr_req_handler and cm_destroy_id, cm_id_priv
could be either erased from the rb_tree twice or not erased at all.
Fixing that by making sure it's erased only once before freeing
cm_id_priv.
Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
Signed-off-by: Doron Tsur <doront@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 97aff2c03a1e4d343266adadb52313613efb027f upstream.
There are 24 EQ registers not 25, I suspect this bug came about because
the registers start at EQ1 not zero. The bug is relatively harmless as
the extra register written is an unused one.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 3fc89adb9fa4beff31374a4bf50b3d099d88ae83 upstream.
Currently a number of Crypto API operations may fail when a signal
occurs. This causes nasty problems as the caller of those operations
are often not in a good position to restart the operation.
In fact there is currently no need for those operations to be
interrupted by user signals at all. All we need is for them to
be killable.
This patch replaces the relevant calls of signal_pending with
fatal_signal_pending, and wait_for_completion_interruptible with
wait_for_completion_killable, respectively.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[bwh: Backported to 3.2: drop change to crypto_user_skcipher_alg(), which
we don't have]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit fd7cd061adcf5f7503515ba52b6a724642a839c8 upstream.
We received several reports of systems rebooting and powering on
after an attempted shutdown. Testing showed that setting
XHCI_SPURIOUS_WAKEUP quirk in addition to the XHCI_SPURIOUS_REBOOT
quirk allowed the system to shutdown as expected for LynxPoint-LP
xHCI controllers. Set the quirk back.
Note that the quirk was originally introduced for LynxPoint and
LynxPoint-LP just for this same reason. See:
commit 638298dc66ea ("xhci: Fix spurious wakeups after S5 on Haswell")
It was later limited to only concern HP machines as it caused
regression on some machines, see both bug and commit:
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=66171
commit 6962d914f317 ("xhci: Limit the spurious wakeup fix only to HP machines")
Later it was discovered that the powering on after shutdown
was limited to LynxPoint-LP (Haswell-ULT) and that some non-LP HP
machine suffered from spontaneous resume from S3 (which should
not be related to the SPURIOUS_WAKEUP quirk at all). An attempt
to fix this then removed the SPURIOUS_WAKEUP flag usage completely.
commit b45abacde3d5 ("xhci: no switching back on non-ULT Haswell")
Current understanding is that LynxPoint-LP (Haswell ULT) machines
need the SPURIOUS_WAKEUP quirk, otherwise they will restart, and
plain Lynxpoint (Haswell) machines may _not_ have the quirk
set otherwise they again will restart.
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Oliver Neukum <oneukum@suse.com>
[Added more history to commit message -Mathias]
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commits c09ec25d3684cad74d851c0f028a495999591279 and
0a939993bff117d3657108ca13b011fc0378aedb upstream.
The same issue like with Panther Point chipsets. If the USB ports are
switched to xHCI on shutdown, the xHCI host will send a spurious interrupt,
which will wake the system. Some BIOS have work around for this, but not all.
One example is Compulab's mini-desktop, the Intense-PC2.
The bug can be avoided if the USB ports are switched back to EHCI on
shutdown.
This patch should be backported to stable kernels as old as 3.12,
that contain the commit 638298dc66ea36623dbc2757a24fc2c4ab41b016
"xhci: Fix spurious wakeups after S5 on Haswell"
Signed-off-by: Denis Turischev <denis@compulab.co.il>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Patch "xhci: Switch Intel Lynx Point ports to EHCI on shutdown."
commit c09ec25d3684cad74d851c0f028a495999591279 is not fully correct
It switches both Lynx Point and Lynx Point-LP ports to EHCI on shutdown.
On some Lynx Point machines it causes spurious interrupt,
which wake the system: bugzilla.kernel.org/show_bug.cgi?id=76291
On Lynx Point-LP on the contrary switching ports to EHCI seems to be
necessary to fix these spurious interrupts.
Signed-off-by: Denis Turischev <denis@compulab.co.il>
Reported-by: Wulf Richartz <wulf.richartz@gmail.com>
Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Combined the above commits and backported to 3.2: adjust context to
apply after "xhci: Limit the spurious wakeup fix only to HP machines" and
"xhci: no switching back on non-ULT Haswell"]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 3b4739b8951d650becbcd855d7d6f18ac98a9a85 upstream.
If a host fails to wake up a isochronous SuperSpeed device from U1/U2
in time for a isoch transfer it will generate a "No ping response error"
Host will then move to the next transfer descriptor.
Handle this case in the same way as missed service errors, tag the
current TD as skipped and handle it on the next transfer event.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e210c422b6fdd2dc123bedc588f399aefd8bf9de upstream.
If the difference is big enough between the bytes asked and received
in a bulk transfer we can get a short transfer event pointing to a TRB in
the middle of the TD. We don't want to handle the TD yet as we will anyway
receive a new event for the last TRB in the TD.
Hold off from finishing the TD and removing it from the list until we
receive an event for the last TRB in the TD
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit ba2374fd2bf379f933773811fdb06cb6a5445f41 upstream.
In preparation for the installation of a large page, any small page
tables that may still exist in the target IOV address range are
removed. However, if a scatter/gather list entry is large enough to
fit more than one large page, the address space for any subsequent
large pages is not cleared of conflicting small page tables.
This can cause legitimate mapping requests to fail with errors of the
form below, potentially followed by a series of IOMMU faults:
ERROR: DMA PTE for vPFN 0xfde00 already set (to 7f83a4003 not 7e9e00083)
In this example, a 4MiB scatter/gather list entry resulted in the
successful installation of a large page @ vPFN 0xfdc00, followed by
a failed attempt to install another large page @ vPFN 0xfde00, due to
the presence of a pointer to a small page table @ 0x7f83a4000.
To address this problem, compute the number of large pages that fit
into a given scatter/gather list entry, and use it to derive the
last vPFN covered by the large page(s).
Signed-off-by: Christian Zander <christian@nervanasys.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
[bwh: Backported to 3.2:
- Add the lvl_pages variable, added by an earlier commit upstream
- Also change arguments to dma_pte_clear_range(), which is called by
dma_pte_free_pagetable() upstream]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 8996eafdcbad149ac0f772fb1649fbb75c482a6a upstream.
Unlike shash algorithms, ahash drivers must implement export
and import as their descriptors may contain hardware state and
cannot be exported as is. Unfortunately some ahash drivers did
not provide them and end up causing crashes with algif_hash.
This patch adds a check to prevent these drivers from registering
ahash algorithms until they are fixed.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e8d65a8d985271a102f07c7456da5b86c19ffe16 upstream.
Add the appropriate quirk to indicate the Lenovo G50-80 has a stereo
mic input where one channel has reverse polarity.
Alsa-info available at:
https://launchpadlibrarian.net/220846272/AlsaInfo.txt
BugLink: https://bugs.launchpad.net/bugs/1504778
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit a54c8f0f2d7df525ff997e2afe71866a1a013064 upstream.
xen-blkfront will crash if the check to talk_to_blkback()
in blkback_changed()(XenbusStateInitWait) returns an error.
The driver data is freed and info is set to NULL. Later during
the close process via talk_to_blkback's call to xenbus_dev_fatal()
the null pointer is passed to and dereference in blkfront_closing.
Signed-off-by: Cathy Avery <cathy.avery@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 15e3d5a285ab9283136dba34bbf72886d9146706 upstream.
3w controller don't dma map small single SGL entry commands but instead
bounce buffer them. Add a helper to identify these commands and don't
call scsi_dma_unmap for them.
Based on an earlier patch from James Bottomley.
Fixes: 118c85 ("3w-9xxx: fix command completion race")
Reported-by: Tóth Attila <atoth@atoth.sote.hu>
Tested-by: Tóth Attila <atoth@atoth.sote.hu>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 95913d97914f44db2b81271c2e2ebd4d2ac2df83 upstream.
So the problem this patch is trying to address is as follows:
CPU0 CPU1
context_switch(A, B)
ttwu(A)
LOCK A->pi_lock
A->on_cpu == 0
finish_task_switch(A)
prev_state = A->state <-.
WMB |
A->on_cpu = 0; |
UNLOCK rq0->lock |
| context_switch(C, A)
`-- A->state = TASK_DEAD
prev_state == TASK_DEAD
put_task_struct(A)
context_switch(A, C)
finish_task_switch(A)
A->state == TASK_DEAD
put_task_struct(A)
The argument being that the WMB will allow the load of A->state on CPU0
to cross over and observe CPU1's store of A->state, which will then
result in a double-drop and use-after-free.
Now the comment states (and this was true once upon a long time ago)
that we need to observe A->state while holding rq->lock because that
will order us against the wakeup; however the wakeup will not in fact
acquire (that) rq->lock; it takes A->pi_lock these days.
We can obviously fix this by upgrading the WMB to an MB, but that is
expensive, so we'd rather avoid that.
The alternative this patch takes is: smp_store_release(&A->on_cpu, 0),
which avoids the MB on some archs, but not important ones like ARM.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: manfred@colorfullife.com
Cc: will.deacon@arm.com
Fixes: e4a52bcb9a18 ("sched: Remove rq->lock from the first half of ttwu()")
Link: http://lkml.kernel.org/r/20150929124509.GG3816@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2:
- Adjust filename
- As smp_store_release() is not defined, use smp_mb()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 225db5762dc1a35b26850477ffa06e5cd0097243 upstream.
When OSS emulation is loaded on ISA SB AWE32 chip, we get now kernel
warnings like:
WARNING: CPU: 0 PID: 2791 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x51/0x80()
sysfs: cannot create duplicate filename '/devices/isa/sbawe.0/sound/card0/seq-oss-0-0'
It's because both emux synth and opl3 drivers try to register their
OSS device object with the same static index number 0. This hasn't
been a big problem until the recent rewrite of device management code
(that exposes sysfs at the same time), but it's been an obvious bug.
This patch works around it just by using a different index number of
emux synth object. There can be a more elegant way to fix, but it's
enough for now, as this code won't be touched so often, in anyway.
Reported-and-tested-by: Michael Shell <list1@michaelshell.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 5bd166872d8f99f156fac191299d24f828bb2348 upstream.
The code to send the RX PN data (for each TID) to the firmware
has a devastating bug: it overwrites the data for TID 0 with
all the TID data, leaving the remaining TIDs zeroed. This will
allow replays to actually be accepted by the firmware, which
could allow waking up the system.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e6740165b8f7f06d8caee0fceab3fb9d790a6fed upstream.
Since commit 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release"),
pppoe_release() calls dev_put(po->pppoe_dev) if sk is in the
PPPOX_ZOMBIE state. But pppoe_flush_dev() can set sk->sk_state to
PPPOX_ZOMBIE _and_ reset po->pppoe_dev to NULL. This leads to the
following oops:
[ 570.140800] BUG: unable to handle kernel NULL pointer dereference at 00000000000004e0
[ 570.142931] IP: [<ffffffffa018c701>] pppoe_release+0x50/0x101 [pppoe]
[ 570.144601] PGD 3d119067 PUD 3dbc1067 PMD 0
[ 570.144601] Oops: 0000 [#1] SMP
[ 570.144601] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox ppp_generic slhc loop crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper acpi_cpufreq evdev serio_raw processor button ext4 crc16 mbcache jbd2 virtio_net virtio_blk virtio_pci virtio_ring virtio
[ 570.144601] CPU: 1 PID: 15738 Comm: ppp-apitest Not tainted 4.2.0 #1
[ 570.144601] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[ 570.144601] task: ffff88003d30d600 ti: ffff880036b60000 task.ti: ffff880036b60000
[ 570.144601] RIP: 0010:[<ffffffffa018c701>] [<ffffffffa018c701>] pppoe_release+0x50/0x101 [pppoe]
[ 570.144601] RSP: 0018:ffff880036b63e08 EFLAGS: 00010202
[ 570.144601] RAX: 0000000000000000 RBX: ffff880034340000 RCX: 0000000000000206
[ 570.144601] RDX: 0000000000000006 RSI: ffff88003d30dd20 RDI: ffff88003d30dd20
[ 570.144601] RBP: ffff880036b63e28 R08: 0000000000000001 R09: 0000000000000000
[ 570.144601] R10: 00007ffee9b50420 R11: ffff880034340078 R12: ffff8800387ec780
[ 570.144601] R13: ffff8800387ec7b0 R14: ffff88003e222aa0 R15: ffff8800387ec7b0
[ 570.144601] FS: 00007f5672f48700(0000) GS:ffff88003fc80000(0000) knlGS:0000000000000000
[ 570.144601] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 570.144601] CR2: 00000000000004e0 CR3: 0000000037f7e000 CR4: 00000000000406a0
[ 570.144601] Stack:
[ 570.144601] ffffffffa018f240 ffff8800387ec780 ffffffffa018f240 ffff8800387ec7b0
[ 570.144601] ffff880036b63e48 ffffffff812caabe ffff880039e4e000 0000000000000008
[ 570.144601] ffff880036b63e58 ffffffff812cabad ffff880036b63ea8 ffffffff811347f5
[ 570.144601] Call Trace:
[ 570.144601] [<ffffffff812caabe>] sock_release+0x1a/0x75
[ 570.144601] [<ffffffff812cabad>] sock_close+0xd/0x11
[ 570.144601] [<ffffffff811347f5>] __fput+0xff/0x1a5
[ 570.144601] [<ffffffff811348cb>] ____fput+0x9/0xb
[ 570.144601] [<ffffffff81056682>] task_work_run+0x66/0x90
[ 570.144601] [<ffffffff8100189e>] prepare_exit_to_usermode+0x8c/0xa7
[ 570.144601] [<ffffffff81001a26>] syscall_return_slowpath+0x16d/0x19b
[ 570.144601] [<ffffffff813babb1>] int_ret_from_sys_call+0x25/0x9f
[ 570.144601] Code: 48 8b 83 c8 01 00 00 a8 01 74 12 48 89 df e8 8b 27 14 e1 b8 f7 ff ff ff e9 b7 00 00 00 8a 43 12 a8 0b 74 1c 48 8b 83 a8 04 00 00 <48> 8b 80 e0 04 00 00 65 ff 08 48 c7 83 a8 04 00 00 00 00 00 00
[ 570.144601] RIP [<ffffffffa018c701>] pppoe_release+0x50/0x101 [pppoe]
[ 570.144601] RSP <ffff880036b63e08>
[ 570.144601] CR2: 00000000000004e0
[ 570.200518] ---[ end trace 46956baf17349563 ]---
pppoe_flush_dev() has no reason to override sk->sk_state with
PPPOX_ZOMBIE. pppox_unbind_sock() already sets sk->sk_state to
PPPOX_DEAD, which is the correct state given that sk is unbound and
po->pppoe_dev is NULL.
Fixes: 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release")
Tested-by: Oleksii Berezhniak <core@irc.lg.ua>
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 0c55627167870255158db1cde0d28366f91c8872 upstream.
This is mostly a hardening fix, given that write-only access to other
users' ttys is usually only given through setgid tty executables.
Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2:
- __proc_set_tty() also takes a task_struct pointer]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e81107d4c6bd098878af9796b24edc8d4a9524fd upstream.
My colleague ran into a program stall on a x86_64 server, where
n_tty_read() was waiting for data even if there was data in the buffer
in the pty. kernel stack for the stuck process looks like below.
#0 [ffff88303d107b58] __schedule at ffffffff815c4b20
#1 [ffff88303d107bd0] schedule at ffffffff815c513e
#2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
#3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
#4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
#5 [ffff88303d107dd0] tty_read at ffffffff81368013
#6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
#7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
#8 [ffff88303d107f00] sys_read at ffffffff811a4306
#9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7
There seems to be two problems causing this issue.
First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
updates ldata->commit_head using smp_store_release() and then checks
the wait queue using waitqueue_active(). However, since there is no
memory barrier, __receive_buf() could return without calling
wake_up_interactive_poll(), and at the same time, n_tty_read() could
start to wait in wait_woken() as in the following chart.
__receive_buf() n_tty_read()
------------------------------------------------------------------------
if (waitqueue_active(&tty->read_wait))
/* Memory operations issued after the
RELEASE may be completed before the
RELEASE operation has completed */
add_wait_queue(&tty->read_wait, &wait);
...
if (!input_available_p(tty, 0)) {
smp_store_release(&ldata->commit_head,
ldata->read_head);
...
timeout = wait_woken(&wait,
TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------
The second problem is that n_tty_read() also lacks a memory barrier
call and could also cause __receive_buf() to return without calling
wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
as in the chart below.
__receive_buf() n_tty_read()
------------------------------------------------------------------------
spin_lock_irqsave(&q->lock, flags);
/* from add_wait_queue() */
...
if (!input_available_p(tty, 0)) {
/* Memory operations issued after the
RELEASE may be completed before the
RELEASE operation has completed */
smp_store_release(&ldata->commit_head,
ldata->read_head);
if (waitqueue_active(&tty->read_wait))
__add_wait_queue(q, wait);
spin_unlock_irqrestore(&q->lock,flags);
/* from add_wait_queue() */
...
timeout = wait_woken(&wait,
TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------
There are also other places in drivers/tty/n_tty.c which have similar
calls to waitqueue_active(), so instead of adding many memory barrier
calls, this patch simply removes the call to waitqueue_active(),
leaving just wake_up*() behind.
This fixes both problems because, even though the memory access before
or after the spinlocks in both wake_up*() and add_wait_queue() can
sneak into the critical section, it cannot go past it and the critical
section assures that they will be serialized (please see "INTER-CPU
ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
better explanation). Moreover, the resulting code is much simpler.
Latency measurement using a ping-pong test over a pty doesn't show any
visible performance drop.
Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2:
- Use wake_up_interruptible(), not wake_up_interruptible_poll()
- There are only two spurious uses of waitqueue_active() to remove]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 72194739f54607bbf8cfded159627a2015381557 upstream.
Add a device quirk for the Logitech PTZ Pro Camera and its sibling the
ConferenceCam CC3000e Camera.
This fixes the failed camera enumeration on some boot, particularly on
machines with fast CPU.
Tested by connecting a Logitech PTZ Pro Camera to a machine with a
Haswell Core i7-4600U CPU @ 2.10GHz, and doing thousands of reboot cycles
while recording the kernel logs and taking camera picture after each boot.
Before the patch, more than 7% of the boots show some enumeration transfer
failures and in a few of them, the kernel is giving up before actually
enumerating the webcam. After the patch, the enumeration has been correct
on every reboot.
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 8484bf2981b3d006426ac052a3642c9ce1d8d980 upstream.
These two headphones need a reset-resume quirk to properly resume to
original volume level.
Signed-off-by: Yao-Wen Mao <yaowen@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit eda7d0f38aaf50dbb2a2de15e8db386c4f6f65fc upstream.
"num_read" is in byte units but we are write u16s so we end up write
twice as much as intended.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 67dfae0cd72fec5cd158b6e5fb1647b7dbe0834c upstream.
This patch fixes one cases where abs() was being used with 64-bit
nanosecond values, where the result may be capped at 32-bits.
This potentially could cause watchdog false negatives on 32-bit
systems, so this patch addresses the issue by using abs64().
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1442279124-7309-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 66eefe5de11db1e0d8f2edc3880d50e7c36a9d43 upstream.
Calling e.g. blk_queue_max_hw_sectors() after calls to
disk_stack_limits() discards the settings determined by
disk_stack_limits().
So we need to make those calls first.
Fixes: 199dc6ed5179 ("md/raid0: update queue parameter in a safer location.")
Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
[bwh: Backported to 3.2: the code being moved looks a little different]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 199dc6ed5179251fa6158a461499c24bdd99c836 upstream.
When a (e.g.) RAID5 array is reshaped to RAID0, the updating
of queue parameters (e.g. max number of sectors per bio) is
done in the wrong place.
It should be part of ->run, but it is actually part of ->takeover.
This means it happens before level_store() calls:
blk_set_stacking_limits(&mddev->queue->limits);
and so it ineffective. This can lead to errors from underlying
devices.
So move all the relevant settings out of create_stripe_zones()
and into raid0_run().
As this can lead to a bug-on it is suitable for any -stable
kernel which supports reshape to RAID0. So 2.6.35 or later.
As the bug has been present for five years there is no urgency,
so no need to rush into -stable.
Fixes: 9af204cf720c ("md: Add support for Raid5->Raid0 and Raid10->Raid0 takeover")
Reported-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
[bwh: Backported to 3.2:
- md has no discard or write-same support
- md is not used by dm-raid so mddev->queue is never null
- Open-code rdev_for_each()
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 646200a041203f440fb6fcf9cacd9efeda9de74c upstream.
The error paths in set_file_size for cifs and smb3 are incorrect.
In the unlikely event that a server did not support set file info
of the file size, the code incorrectly falls back to trying SMBWriteX
(note that only the original core SMB Write, used for example by DOS,
can set the file size this way - this actually does not work for the more
recent SMBWriteX). The idea was since the old DOS SMB Write could set
the file size if you write zero bytes at that offset then use that if
server rejects the normal set file info call.
Fortunately the SMBWriteX will never be sent on the wire (except when
file size is zero) since the length and offset fields were reversed
in the two places in this function that call SMBWriteX causing
the fall back path to return an error. It is also important to never call
an SMB request from an SMB2/sMB3 session (which theoretically would
be possible, and can cause a brief session drop, although the client
recovers) so this should be fixed. In practice this path does not happen
with modern servers but the error fall back to SMBWriteX is clearly wrong.
Removing the calls to SMBWriteX in the error paths in cifs_set_file_size
Pointed out by PaX/grsecurity team
Signed-off-by: Steve French <steve.french@primarydata.com>
Reported-by: PaX Team <pageexec@freemail.hu>
CC: Emese Revfy <re.emese@gmail.com>
CC: Brad Spengler <spender@grsecurity.net>
[bwh: Backported to 3.2: deleted code looks slightly different]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 2f84a8990ebbe235c59716896e017c6b2ca1200f upstream.
SunDong reported the following on
https://bugzilla.kernel.org/show_bug.cgi?id=103841
I think I find a linux bug, I have the test cases is constructed. I
can stable recurring problems in fedora22(4.0.4) kernel version,
arch for x86_64. I construct transparent huge page, when the parent
and child process with MAP_SHARE, MAP_PRIVATE way to access the same
huge page area, it has the opportunity to lead to huge page copy on
write failure, and then it will munmap the child corresponding mmap
area, but then the child mmap area with VM_MAYSHARE attributes, child
process munmap this area can trigger VM_BUG_ON in set_vma_resv_flags
functions (vma - > vm_flags & VM_MAYSHARE).
There were a number of problems with the report (e.g. it's hugetlbfs that
triggers this, not transparent huge pages) but it was fundamentally
correct in that a VM_BUG_ON in set_vma_resv_flags() can be triggered that
looks like this
vma ffff8804651fd0d0 start 00007fc474e00000 end 00007fc475e00000
next ffff8804651fd018 prev ffff8804651fd188 mm ffff88046b1b1800
prot 8000000000000027 anon_vma (null) vm_ops ffffffff8182a7a0
pgoff 0 file ffff88106bdb9800 private_data (null)
flags: 0x84400fb(read|write|shared|mayread|maywrite|mayexec|mayshare|dontexpand|hugetlb)
------------
kernel BUG at mm/hugetlb.c:462!
SMP
Modules linked in: xt_pkttype xt_LOG xt_limit [..]
CPU: 38 PID: 26839 Comm: map Not tainted 4.0.4-default #1
Hardware name: Dell Inc. PowerEdge R810/0TT6JF, BIOS 2.7.4 04/26/2012
set_vma_resv_flags+0x2d/0x30
The VM_BUG_ON is correct because private and shared mappings have
different reservation accounting but the warning clearly shows that the
VMA is shared.
When a private COW fails to allocate a new page then only the process
that created the VMA gets the page -- all the children unmap the page.
If the children access that data in the future then they get killed.
The problem is that the same file is mapped shared and private. During
the COW, the allocation fails, the VMAs are traversed to unmap the other
private pages but a shared VMA is found and the bug is triggered. This
patch identifies such VMAs and skips them.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: SunDong <sund_sky@126.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Rientjes <rientjes@google.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 95c2b17534654829db428f11bcf4297c059a2a7e upstream.
Per-IRQ directories in procfs are created only when a handler is first
added to the irqdesc, not when the irqdesc is created. In the case of
a shared IRQ, multiple tasks can race to create a directory. This
race condition seems to have been present forever, but is easier to
hit with async probing.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Link: http://lkml.kernel.org/r/1443266636.2004.2.camel@decadent.org.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|