lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2016-04-12	Linux 4.5.1v4.5.1	Greg Kroah-Hartman

2016-04-12	perf/x86/intel: Fix PEBS data source interpretation on Nehalem/Westmere	Andi Kleen
	commit e17dc65328057c00db7e1bfea249c8771a78b30b upstream. Jiri reported some time ago that some entries in the PEBS data source table in perf do not agree with the SDM. We investigated and the bits changed for Sandy Bridge, but the SDM was not updated. perf already implements the bits correctly for Sandy Bridge and later. This patch patches it up for Nehalem and Westmere. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1456871124-15985-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	perf/x86/intel: Use PAGE_SIZE for PEBS buffer size on Core2	Jiri Olsa
	commit e72daf3f4d764c47fb71c9bdc7f9c54a503825b1 upstream. Using PAGE_SIZE buffers makes the WRMSR to PERF_GLOBAL_CTRL in intel_pmu_enable_all() mysteriously hang on Core2. As a workaround, we don't do this. The hard lockup is easily triggered by running 'perf test attr' repeatedly. Most of the time it gets stuck on sample session with small periods. # perf test attr -vv 14: struct perf_event_attr setup : --- start --- ... 'PERF_TEST_ATTR=/tmp/tmpuEKz3B /usr/bin/perf record -o /tmp/tmpuEKz3B/perf.data -c 123 kill >/dev/null 2>&1' ret 1 Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20160301190352.GA8355@krava.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	perf/x86/intel: Fix PEBS warning by only restoring active PMU in pmi	Kan Liang
	commit c3d266c8a9838cc141b69548bc3b1b18808ae8c4 upstream. This patch tries to fix a PEBS warning found in my stress test. The following perf command can easily trigger the pebs warning or spurious NMI error on Skylake/Broadwell/Haswell platforms: sudo perf record -e 'cpu/umask=0x04,event=0xc4/pp,cycles,branches,ref-cycles,cache-misses,cache-references' --call-graph fp -b -c1000 -a Also the NMI watchdog must be enabled. For this case, the events number is larger than counter number. So perf has to do multiplexing. In perf_mux_hrtimer_handler, it does perf_pmu_disable(), schedule out old events, rotate_ctx, schedule in new events and finally perf_pmu_enable(). If the old events include precise event, the MSR_IA32_PEBS_ENABLE should be cleared when perf_pmu_disable(). The MSR_IA32_PEBS_ENABLE should keep 0 until the perf_pmu_enable() is called and the new event is precise event. However, there is a corner case which could restore PEBS_ENABLE to stale value during the above period. In perf_pmu_disable(), GLOBAL_CTRL will be set to 0 to stop overflow and followed PMI. But there may be pending PMI from an earlier overflow, which cannot be stopped. So even GLOBAL_CTRL is cleared, the kernel still be possible to get PMI. At the end of the PMI handler, __intel_pmu_enable_all() will be called, which will restore the stale values if old events haven't scheduled out. Once the stale pebs value is set, it's impossible to be corrected if the new events are non-precise. Because the pebs_enabled will be set to 0. x86_pmu.enable_all() will ignore the MSR_IA32_PEBS_ENABLE setting. As a result, the following NMI with stale PEBS_ENABLE trigger pebs warning. The pending PMI after enabled=0 will become harmless if the NMI handler does not change the state. This patch checks cpuc->enabled in pmi and only restore the state when PMU is active. Here is the dump: Call Trace: <NMI> [<ffffffff813c3a2e>] dump_stack+0x63/0x85 [<ffffffff810a46f2>] warn_slowpath_common+0x82/0xc0 [<ffffffff810a483a>] warn_slowpath_null+0x1a/0x20 [<ffffffff8100fe2e>] intel_pmu_drain_pebs_nhm+0x2be/0x320 [<ffffffff8100caa9>] intel_pmu_handle_irq+0x279/0x460 [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 [<ffffffff811f290d>] ? vunmap_page_range+0x20d/0x330 [<ffffffff811f2f11>] ? unmap_kernel_range_noflush+0x11/0x20 [<ffffffff8148379f>] ? ghes_copy_tofrom_phys+0x10f/0x2a0 [<ffffffff814839c8>] ? ghes_read_estatus+0x98/0x170 [<ffffffff81005a7d>] perf_event_nmi_handler+0x2d/0x50 [<ffffffff810310b9>] nmi_handle+0x69/0x120 [<ffffffff810316f6>] default_do_nmi+0xe6/0x100 [<ffffffff810317f2>] do_nmi+0xe2/0x130 [<ffffffff817aea71>] end_repeat_nmi+0x1a/0x1e [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 <<EOE>> <IRQ> [<ffffffff81006df8>] ? x86_perf_event_set_period+0xd8/0x180 [<ffffffff81006eec>] x86_pmu_start+0x4c/0x100 [<ffffffff8100722d>] x86_pmu_enable+0x28d/0x300 [<ffffffff811994d7>] perf_pmu_enable.part.81+0x7/0x10 [<ffffffff8119cb70>] perf_mux_hrtimer_handler+0x200/0x280 [<ffffffff8119c970>] ? __perf_install_in_context+0xc0/0xc0 [<ffffffff8110f92d>] __hrtimer_run_queues+0xfd/0x280 [<ffffffff811100d8>] hrtimer_interrupt+0xa8/0x190 [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0 [<ffffffff81051bd8>] local_apic_timer_interrupt+0x38/0x60 [<ffffffff817af01d>] smp_apic_timer_interrupt+0x3d/0x50 [<ffffffff817ad15c>] apic_timer_interrupt+0x8c/0xa0 <EOI> [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0 [<ffffffff81123de5>] ? smp_call_function_single+0xd5/0x130 [<ffffffff81123ddb>] ? smp_call_function_single+0xcb/0x130 [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0 [<ffffffff8119765a>] event_function_call+0x10a/0x120 [<ffffffff8119c660>] ? ctx_resched+0x90/0x90 [<ffffffff811971e0>] ? cpu_clock_event_read+0x30/0x30 [<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60 [<ffffffff8119772b>] _perf_event_enable+0x5b/0x70 [<ffffffff81197388>] perf_event_for_each_child+0x38/0xa0 [<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60 [<ffffffff811a0ffd>] perf_ioctl+0x12d/0x3c0 [<ffffffff8134d855>] ? selinux_file_ioctl+0x95/0x1e0 [<ffffffff8124a3a1>] do_vfs_ioctl+0xa1/0x5a0 [<ffffffff81036d29>] ? sched_clock+0x9/0x10 [<ffffffff8124a919>] SyS_ioctl+0x79/0x90 [<ffffffff817ac4b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4 ---[ end trace aef202839fe9a71d ]--- Uhhuh. NMI received for unknown reason 2d on CPU 2. Do you have a strange power saving mode enabled? Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1457046448-6184-1-git-send-email-kan.liang@intel.com [ Fixed various typos and other small details. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	perf/x86/intel/uncore: Remove SBOX support for BDX-DE	Kan Liang
	commit 6cb2f1d9af5b0f0afdd4e689d969df4b5c76a4c2 upstream. BDX-DE and BDX-EP share the same uncore code path. But there is no sbox in BDX-DE. This patch remove SBOX support for BDX-DE. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <tonyb@cybernetics.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Battersby <tonyb@cybernetics.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/37D7C6CF3E00A74B8858931C1DB2F0770589D336@SHSMSX103.ccr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	perf/x86/pebs: Add workaround for broken OVFL status on HSW+	Stephane Eranian
	commit 8077eca079a212f26419c57226f28696b7100683 upstream. This patch fixes an issue with the GLOBAL_OVERFLOW_STATUS bits on Haswell, Broadwell and Skylake processors when using PEBS. The SDM stipulates that when the PEBS iterrupt threshold is crossed, an interrupt is posted and the kernel is interrupted. The kernel will find GLOBAL_OVF_SATUS bit 62 set indicating there are PEBS records to drain. But the bits corresponding to the actual counters should NOT be set. The kernel follows the SDM and assumes that all PEBS events are processed in the drain_pebs() callback. The kernel then checks for remaining overflows on any other (non-PEBS) events and processes these in the for_each_bit_set(&status) loop. As it turns out, under certain conditions on HSW and later processors, on PEBS buffer interrupt, bit 62 is set but the counter bits may be set as well. In that case, the kernel drains PEBS and generates SAMPLES with the EXACT tag, then it processes the counter bits, and generates normal (non-EXACT) SAMPLES. I ran into this problem by trying to understand why on HSW sampling on a PEBS event was sometimes returning SAMPLES without the EXACT tag. This should not happen on user level code because HSW has the eventing_ip which always point to the instruction that caused the event. The workaround in this patch simply ensures that the bits for the counters used for PEBS events are cleared after the PEBS buffer has been drained. With this fix 100% of the PEBS samples on my user code report the EXACT tag. Before: $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase $ perf report -D \| fgrep SAMPLES PERF_RECORD_SAMPLE(IP, 0x2): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y \--- EXACT tag is missing After: $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase $ perf report -D \| fgrep SAMPLES PERF_RECORD_SAMPLE(IP, 0x4002): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y \--- EXACT tag is set The problem tends to appear more often when multiple PEBS events are used. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: adrian.hunter@intel.com Cc: kan.liang@intel.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1457034642-21837-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	sched/cputime: Fix steal time accounting vs. CPU hotplug	Thomas Gleixner
	commit e9532e69b8d1d1284e8ecf8d2586de34aec61244 upstream. On CPU hotplug the steal time accounting can keep a stale rq->prev_steal_time value over CPU down and up. So after the CPU comes up again the delta calculation in steal_account_process_tick() wreckages itself due to the unsigned math: u64 steal = paravirt_steal_clock(smp_processor_id()); steal -= this_rq()->prev_steal_time; So if steal is smaller than rq->prev_steal_time we end up with an insane large value which then gets added to rq->prev_steal_time, resulting in a permanent wreckage of the accounting. As a consequence the per CPU stats in /proc/stat become stale. Nice trick to tell the world how idle the system is (100%) while the CPU is 100% busy running tasks. Though we prefer realistic numbers. None of the accounting values which use a previous value to account for fractions is reset at CPU hotplug time. update_rq_clock_task() has a sanity check for prev_irq_time and prev_steal_time_rq, but that sanity check solely deals with clock warps and limits the /proc/stat visible wreckage. The prev_time values are still wrong. Solution is simple: Reset rq->prev_*_time when the CPU is plugged in again. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: commit 095c0aa83e52 "sched: adjust scheduler cpu power for stolen time" Fixes: commit aa483808516c "sched: Remove irq time from available CPU power" Fixes: commit e6e6685accfa "KVM guest: Steal time accounting" Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603041539490.3686@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	scsi_common: do not clobber fixed sense information	Hannes Reinecke
	commit ba08311647892cc7912de74525fd78416caf544a upstream. For fixed sense the information field is 32 bits, to we need to truncate the information field to avoid clobbering the sense code. Fixes: a1524f226a02 ("libata-eh: Set 'information' field for autosense") Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	PM / sleep: Clear pm_suspend_global_flags upon hibernate	Lukas Wunner
	commit 276142730c39c9839465a36a90e5674a8c34e839 upstream. When suspending to RAM, waking up and later suspending to disk, we gratuitously runtime resume devices after the thaw phase. This does not occur if we always suspend to RAM or always to disk. pm_complete_with_resume_check(), which gets called from pci_pm_complete() among others, schedules a runtime resume if PM_SUSPEND_FLAG_FW_RESUME is set. The flag is set during a suspend-to-RAM cycle. It is cleared at the beginning of the suspend-to-RAM cycle but not afterwards and it is not cleared during a suspend-to-disk cycle at all. Fix it. Fixes: ef25ba047601 (PM / sleep: Add flags to indicate platform firmware involvement) Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled	Len Brown
	commit d70e28f57e14a481977436695b0c9ba165472431 upstream. Some SKL-H configurations require "intel_idle.max_cstate=7" to boot. While that is an effective workaround, it disables C10. This patch detects the problematic configuration, and disables C8 and C9, keeping C10 enabled. Note that enabling SGX in BIOS SETUP can also prevent this issue, if the system BIOS provides that option. https://bugzilla.kernel.org/show_bug.cgi?id=109081 "Freezes with Intel i7 6700HQ (Skylake), unless intel_idle.max_cstate=7" Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mtd: onenand: fix deadlock in onenand_block_markbad	Aaro Koskinen
	commit 5e64c29e98bfbba1b527b0a164f9493f3db9e8cb upstream. Commit 5942ddbc500d ("mtd: introduce mtd_block_markbad interface") incorrectly changed onenand_block_markbad() to call mtd_block_markbad instead of onenand_chip's block_markbad function. As a result the function will now recurse and deadlock. Fix by reverting the change. Fixes: 5942ddbc500d ("mtd: introduce mtd_block_markbad interface") Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mm/page_alloc: prevent merging between isolated and other pageblocks	Vlastimil Babka
	commit d9dddbf556674bf125ecd925b24e43a5cf2a568a upstream. Hanjun Guo has reported that a CMA stress test causes broken accounting of CMA and free pages: > Before the test, I got: > -bash-4.3# cat /proc/meminfo \| grep Cma > CmaTotal: 204800 kB > CmaFree: 195044 kB > > > After running the test: > -bash-4.3# cat /proc/meminfo \| grep Cma > CmaTotal: 204800 kB > CmaFree: 6602584 kB > > So the freed CMA memory is more than total.. > > Also the the MemFree is more than mem total: > > -bash-4.3# cat /proc/meminfo > MemTotal: 16342016 kB > MemFree: 22367268 kB > MemAvailable: 22370528 kB Laura Abbott has confirmed the issue and suspected the freepage accounting rewrite around 3.18/4.0 by Joonsoo Kim. Joonsoo had a theory that this is caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA pageblocks: > CMA isolates MAX_ORDER aligned blocks, but, during the process, > partialy isolated block exists. If MAX_ORDER is 11 and > pageblock_order is 9, two pageblocks make up MAX_ORDER > aligned block and I can think following scenario because pageblock > (un)isolation would be done one by one. > > (each character means one pageblock. 'C', 'I' means MIGRATE_CMA, > MIGRATE_ISOLATE, respectively. > > CC -> IC -> II (Isolation) > II -> CI -> CC (Un-isolation) > > If some pages are freed at this intermediate state such as IC or CI, > that page could be merged to the other page that is resident on > different type of pageblock and it will cause wrong freepage count. This was supposed to be prevented by CMA operating on MAX_ORDER blocks, but since it doesn't hold the zone->lock between pageblocks, a race window does exist. It's also likely that unexpected merging can occur between MIGRATE_ISOLATE and non-CMA pageblocks. This should be prevented in __free_one_page() since commit 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock"). However, we only check the migratetype of the pageblock where buddy merging has been initiated, not the migratetype of the buddy pageblock (or group of pageblocks) which can be MIGRATE_ISOLATE. Joonsoo has suggested checking for buddy migratetype as part of page_is_buddy(), but that would add extra checks in allocator hotpath and bloat-o-meter has shown significant code bloat (the function is inline). This patch reduces the bloat at some expense of more complicated code. The buddy-merging while-loop in __free_one_page() is initially bounded to pageblock_border and without any migratetype checks. The checks are placed outside, bumping the max_order if merging is allowed, and returning to the while-loop with a statement which can't be possibly considered harmful. This fixes the accounting bug and also removes the arguably weird state in the original commit 3c605096d315 where buddies could be left unmerged. Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") Link: https://lkml.org/lkml/2016/3/2/280 Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Hanjun Guo <guohanjun@huawei.com> Tested-by: Hanjun Guo <guohanjun@huawei.com> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Debugged-by: Laura Abbott <labbott@redhat.com> Debugged-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	ocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list	Joseph Qi
	commit be12b299a83fc807bbaccd2bcb8ec50cbb0cb55c upstream. When master handles convert request, it queues ast first and then returns status. This may happen that the ast is sent before the request status because the above two messages are sent by two threads. And right after the ast is sent, if master down, it may trigger BUG in dlm_move_lockres_to_recovery_list in the requested node because ast handler moves it to grant list without clear lock->convert_pending. So remove BUG_ON statement and check if the ast is processed in dlmconvert_remote. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reported-by: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Tariq Saeed <tariq.x.saeed@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	ocfs2/dlm: fix race between convert and recovery	Joseph Qi
	commit ac7cf246dfdbec3d8fed296c7bf30e16f5099dac upstream. There is a race window between dlmconvert_remote and dlm_move_lockres_to_recovery_list, which will cause a lock with OCFS2_LOCK_BUSY in grant list, thus system hangs. dlmconvert_remote { spin_lock(&res->spinlock); list_move_tail(&lock->list, &res->converting); lock->convert_pending = 1; spin_unlock(&res->spinlock); status = dlm_send_remote_convert_request(); >>>>>> race window, master has queued ast and return DLM_NORMAL, and then down before sending ast. this node detects master down and calls dlm_move_lockres_to_recovery_list, which will revert the lock to grant list. Then OCFS2_LOCK_BUSY won't be cleared as new master won't send ast any more because it thinks already be authorized. spin_lock(&res->spinlock); lock->convert_pending = 0; if (status != DLM_NORMAL) dlm_revert_pending_convert(res, lock); spin_unlock(&res->spinlock); } In this case, check if res->state has DLM_LOCK_RES_RECOVERING bit set (res is still in recovering) or res master changed (new master has finished recovery), reset the status to DLM_RECOVERING, then it will retry convert. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reported-by: Yiwen Jiang <jiangyiwen@huawei.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Tariq Saeed <tariq.x.saeed@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	ocfs2: o2hb: fix double free bug	Junxiao Bi
	commit 9e13f1f9de1cb143fbae6f1170f26c8544b64cff upstream. This is a regression issue and caused the following kernel panic when do ocfs2 multiple test. BUG: unable to handle kernel paging request at 00000002000800c0 IP: [<ffffffff81192978>] kmem_cache_alloc+0x78/0x160 PGD 7bbe5067 PUD 0 Oops: 0000 [#1] SMP Modules linked in: ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xen_kbdfront xen_netfront xen_fbfront xen_blkfront CPU: 2 PID: 4044 Comm: mpirun Not tainted 4.5.0-rc5-next-20160225 #1 Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014 task: ffff88007a521a80 ti: ffff88007aed0000 task.ti: ffff88007aed0000 RIP: 0010:[<ffffffff81192978>] [<ffffffff81192978>] kmem_cache_alloc+0x78/0x160 RSP: 0018:ffff88007aed3a48 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000001991 RDX: 0000000000001990 RSI: 00000000024000c0 RDI: 000000000001b330 RBP: ffff88007aed3a98 R08: ffff88007d29b330 R09: 00000002000800c0 R10: 0000000c51376d87 R11: ffff8800792cac38 R12: ffff88007cc30f00 R13: 00000000024000c0 R14: ffffffff811b053f R15: ffff88007aed3ce7 FS: 0000000000000000(0000) GS:ffff88007d280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000002000800c0 CR3: 000000007aeb2000 CR4: 00000000000406e0 Call Trace: __d_alloc+0x2f/0x1a0 d_alloc+0x17/0x80 lookup_dcache+0x8a/0xc0 path_openat+0x3c3/0x1210 do_filp_open+0x80/0xe0 do_sys_open+0x110/0x200 SyS_open+0x19/0x20 do_syscall_64+0x72/0x230 entry_SYSCALL64_slow_path+0x25/0x25 Code: 05 e6 77 e7 7e 4d 8b 08 49 8b 40 10 4d 85 c9 0f 84 dd 00 00 00 48 85 c0 0f 84 d4 00 00 00 49 63 44 24 20 49 8b 3c 24 48 8d 4a 01 <49> 8b 1c 01 4c 89 c8 65 48 0f c7 0f 0f 94 c0 3c 01 75 b6 49 63 RIP kmem_cache_alloc+0x78/0x160 CR2: 00000002000800c0 ---[ end trace 823969e602e4aaac ]--- Fixes: a4a1dfa4bb8b("ocfs2/cluster: fix memory leak in o2hb_region_release") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	Input: ati_remote2 - fix crashes on detecting device with invalid descriptor	Vladis Dronov
	commit 950336ba3e4a1ffd2ca60d29f6ef386dd2c7351d upstream. The ati_remote2 driver expects at least two interfaces with one endpoint each. If given malicious descriptor that specify one interface or no endpoints, it will crash in the probe function. Ensure there is at least two interfaces and one endpoint for each interface before using it. The full disclosure: http://seclists.org/bugtraq/2016/Mar/90 Reported-by: Ralf Spenneberg <ralf@spenneberg.net> Signed-off-by: Vladis Dronov <vdronov@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	Input: ims-pcu - sanity check against missing interfaces	Oliver Neukum
	commit a0ad220c96692eda76b2e3fd7279f3dcd1d8a8ff upstream. A malicious device missing interface can make the driver oops. Add sanity checking. Signed-off-by: Oliver Neukum <ONeukum@suse.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	Input: synaptics - handle spurious release of trackstick buttons, again	Benjamin Tissoires
	commit 82be788c96ed5978d3cb4a00079e26b981a3df3f upstream. Looks like the fimware 8.2 still has the extra buttons spurious release bug. Link: https://bugzilla.kernel.org/show_bug.cgi?id=114321 Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	writeback, cgroup: fix use of the wrong bdi_writeback which mismatches the inode	Tejun Heo
	commit aaf2559332ba272671bb870464a99b909b29a3a1 upstream. When cgroup writeback is in use, there can be multiple wb's (bdi_writeback's) per bdi and an inode may switch among them dynamically. In a couple places, the wrong wb was used leading to performing operations on the wrong list under the wrong lock corrupting the io lists. * writeback_single_inode() was taking @wb parameter and used it to remove the inode from io lists if it becomes clean after writeback. The callers of this function were always passing in the root wb regardless of the actual wb that the inode was associated with, which could also change while writeback is in progress. Fix it by dropping the @wb parameter and using inode_to_wb_and_lock_list() to determine and lock the associated wb. * After writeback_sb_inodes() writes out an inode, it re-locks @wb and inode to remove it from or move it to the right io list. It assumes that the inode is still associated with @wb; however, the inode may have switched to another wb while writeback was in progress. Fix it by using inode_to_wb_and_lock_list() to determine and lock the associated wb after writeback is complete. As the function requires the original @wb->list_lock locked for the next iteration, in the unlikely case where the inode has changed association, switch the locks. Kudos to Tahsin for pinpointing these subtle breakages. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: d10c80955265 ("writeback: implement foreign cgroup inode bdi_writeback switching") Link: http://lkml.kernel.org/g/CAAeU0aMYeM_39Y2+PaRvyB1nqAPYZSNngJ1eBRmrxn7gKAt2Mg@mail.gmail.com Reported-and-diagnosed-by: Tahsin Erdogan <tahsin@google.com> Tested-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	writeback, cgroup: fix premature wb_put() in locked_inode_to_wb_and_lock_list()	Tejun Heo
	commit 614a4e3773148a31f58dc174bbf578ceb63510c2 upstream. locked_inode_to_wb_and_lock_list() wb_get()'s the wb associated with the target inode, unlocks inode, locks the wb's list_lock and verifies that the inode is still associated with the wb. To prevent the wb going away between dropping inode lock and acquiring list_lock, the wb is pinned while inode lock is held. The wb reference is put right after acquiring list_lock citing that the wb won't be dereferenced anymore. This isn't true. If the inode is still associated with the wb, the inode has reference and it's safe to return the wb; however, if inode has been switched, the wb still needs to be unlocked which is a dereference and can lead to use-after-free if it it races with wb destruction. Fix it by putting the reference after releasing list_lock. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 87e1d789bf55 ("writeback: implement [locked_]inode_to_wb_and_lock_list()") Tested-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	ACPI / PM: Runtime resume devices when waking from hibernate	Lukas Wunner
	commit fbda4b38fa3995aa0777fe9cbbdcb223c6292083 upstream. Commit 58a1fbbb2ee8 ("PM / PCI / ACPI: Kick devices that might have been reset by firmware") added a runtime resume for devices that were runtime suspended when the system entered suspend-to-RAM. Briefly, the motivation was to ensure that devices did not remain in a reset-power-on state after resume, potentially preventing deep SoC-wide low-power states from being entered on idle. Currently we're not doing the same when leaving suspend-to-disk and this asymmetry is a problem if drivers rely on the automatic resume triggered by pm_complete_with_resume_check(). Fix it. Fixes: 58a1fbbb2ee8 (PM / PCI / ACPI: Kick devices that might have been reset by firmware) Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	ARM: dts: at91: sama5d4 Xplained: don't disable hsmci regulator	Ludovic Desroches
	commit b02acd4e62602a6ab307da84388a16bf60106c48 upstream. If enabling the hsmci regulator on card detection, the board can reboot on sd card insertion. Keeping the regulator always enabled fixes this issue. Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Fixes: 8d545f32bd77 ("ARM: at91/dt: sama5d4 xplained: add regulators for v(q)mmc1 supplies") Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	ARM: dts: at91: sama5d3 Xplained: don't disable hsmci regulator	Ludovic Desroches
	commit ae3fc8ea08e405682f1fa959f94b6e4126afbc1b upstream. If enabling the hsmci regulator on card detection, the board can reboot on sd card insertion. Keeping the regulator always enabled fixes this issue. Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Fixes: 1b53e3416dd0 ("ARM: at91/dt: sama5d3 xplained: add fixed regulator for vmmc0") Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	nfsd: fix deadlock secinfo+readdir compound	J. Bruce Fields
	commit 2f6fc056e899bd0144a08da5cacaecbe8997cd74 upstream. nfsd_lookup_dentry exits with the parent filehandle locked. fh_put also unlocks if necessary (nfsd filehandle locking is probably too lenient), so it gets unlocked eventually, but if the following op in the compound needs to lock it again, we can deadlock. A fuzzer ran into this; normal clients don't send a secinfo followed by a readdir in the same compound. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	nfsd4: fix bad bounds checking	J. Bruce Fields
	commit 4aed9c46afb80164401143aa0fdcfe3798baa9d5 upstream. A number of spots in the xdr decoding follow a pattern like n = be32_to_cpup(p++); READ_BUF(n + 4); where n is a u32. The only bounds checking is done in READ_BUF itself, but since it's checking (n + 4), it won't catch cases where n is very large, (u32)(-4) or higher. I'm not sure exactly what the consequences are, but we've seen crashes soon after. Instead, just break these up into two READ_BUF()s. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	iser-target: Rework connection termination	Jenny Derzhavetz
	commit 6d1fba0c2cc7efe42fd761ecbba833ed0ea7b07e upstream. When we receive an event that triggers connection termination, we have a a couple of things we may want to do: 1. In case we are already terminating, bailout early 2. In case we are connected but not bound, disconnect and schedule a connection cleanup silently (don't reinstate) 3. In case we are connected and bound, disconnect and reinstate the connection This rework fixes a bug that was detected against a mis-behaved initiator which rejected our rdma_cm accept, in this stage the isert_conn is no bound and reinstate caused a bogus dereference. What's great about this is that we don't need the post_recv_buf_count anymore, so get rid of it. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	iser-target: Separate flows for np listeners and connections cma events	Jenny Derzhavetz
	commit f81bf458208ef6d12b2fc08091204e3859dcdba4 upstream. No need to restrict this check to specific events. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	iser-target: Add new state ISER_CONN_BOUND to isert_conn	Jenny Derzhavetz
	commit aea92980601f7ddfcb3c54caa53a43726314fe46 upstream. We need an indication that isert_conn->iscsi_conn binding has happened so we'll know not to invoke a connection reinstatement on an unbound connection which will lead to a bogus isert_conn->conn dereferece. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	iser-target: Fix identification of login rx descriptor type	Jenny Derzhavetz
	commit b89a7c25462b164db280abc3b05d4d9d888d40e9 upstream. Once connection request is accepted, one rx descriptor is posted to receive login request. This descriptor has rx type, but is outside the main pool of rx descriptors, and thus was mistreated as tx type. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	target: Fix target_release_cmd_kref shutdown comp leak	Himanshu Madhani
	commit 5e47f1985d7107331c3f64fb3ec83d66fd73577e upstream. This patch fixes an active I/O shutdown bug for fabric drivers using target_wait_for_sess_cmds(), where se_cmd descriptor shutdown would result in hung tasks waiting indefinitely for se_cmd->cmd_wait_comp to complete(). To address this bug, drop the incorrect list_del_init() usage in target_wait_for_sess_cmds() and always complete() during se_cmd target_release_cmd_kref() put, in order to let caller invoke the final fabric release callback into se_cmd->se_tfo->release_cmd() code. Reported-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Tested-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	clk: bcm2835: Fix setting of PLL divider clock rates	Eric Anholt
	commit 773b3966dd3cdaeb68e7f2edfe5656abac1dc411 upstream. Our dividers weren't being set successfully because CM_PASSWORD wasn't included in the register write. It looks easier to just compute the divider to write ourselves than to update clk-divider for the ability to OR in some arbitrary bits on write. Fixes about half of the video modes on my HDMI monitor (everything except 720x400). Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Michael Turquette <mturquette@baylibre.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	clk: rockchip: add hclk_cpubus to the list of rk3188 critical clocks	Alexander Kochetkov
	commit e8b63288b37dbb8457b510c9d96f6006da4653f6 upstream. hclk_cpubus needs to keep running because it is needed for devices like the rom, i2s0 or spdif to be accessible via cpu. Without that all accesses to devices (readl/writel) return wrong data. So add it to the list of critical clocks. Fixes: 78eaf6095cc763c ("clk: rockchip: disable unused clocks") Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	clk: rockchip: rk3368: fix hdmi_cec gate-register	Heiko Stuebner
	commit fd0c0740fac17a014704ef89d8c8b1768711ca59 upstream. Fix a typo making the sclk_hdmi_cec access a wrong register to handle its gate. Fixes: 3536c97a52db ("clk: rockchip: add rk3368 clock controller") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: zhangqing <zhangqing@rock-chips.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	clk: rockchip: rk3368: fix parents of video encoder/decoder	Heiko Stuebner
	commit 0f28d98463498c61c61a38aacbf9f69e92e85e9d upstream. The vdpu and vepu clocks can also be parented to the npll and current parent list also is wrong as it would use the npll as "usbphy" source, so adapt the parent to the correct one. Fixes: 3536c97a52db ("clk: rockchip: add rk3368 clock controller") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: zhangqing <zhangqing@rock-chips.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	clk: rockchip: rk3368: fix cpuclk core dividers	Heiko Stuebner
	commit c6d5fe2ca8286f35a79f7345c9378c39d48a1527 upstream. Similar to commit 9880d4277f6a ("clk: rockchip: fix rk3288 cpuclk core dividers") it seems the cpuclk dividers are one to high on the rk3368 as well. And again similar to the previous fix, we opt to make the divider list contain the values to be written to use the same paradigm for them on all supported socs. Fixes: 3536c97a52db ("clk: rockchip: add rk3368 clock controller") Reported-by: Zhang Qing <zhangqing@rock-chips.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: zhangqing <zhangqing@rock-chips.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	clk: rockchip: rk3368: fix cpuclk mux bit of big cpu-cluster	Heiko Stuebner
	commit 535ebd428aeb07c3327947281306f2943f2c9faa upstream. Both clusters have their mux bit in bit 7 of their respective register. For whatever reason the big cluster currently lists bit 15 which is definitly wrong. Fixes: 3536c97a52db ("clk: rockchip: add rk3368 clock controller") Reported-by: Zhang Qing <zhangqing@rock-chips.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: zhangqing <zhangqing@rock-chips.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: atmel-mci: Check pdata for NULL before dereferencing it at DMA config	Brent Taylor
	commit 93c77d2999b09f2084b033ea6489915e0104ad9c upstream. Using an at91sam9g20ek development board with DTS configuration may trigger a kernel panic because of a NULL pointer dereference exception, while configuring DMA. Let's fix this by adding a check for pdata before dereferencing it. Signed-off-by: Brent Taylor <motobud@gmail.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: sdhci: Fix override of timeout clk wrt max_busy_timeout	Adrian Hunter
	commit 995136247915c5cee633d55ba23f6eebf67aa567 upstream. Normally the timeout clock frequency is read from the capabilities register. It is also possible to set the value prior to calling sdhci_add_host() in which case that value will override the capabilities register value. However that was being done after calculating max_busy_timeout so that max_busy_timeout was being calculated using the wrong value of timeout_clk. Fix that by moving the override before max_busy_timeout is calculated. The result is that the max_busy_timeout and max_discard increase for BSW devices so that, for example, the time for mkfs.ext4 on a 64GB eMMC drops from about 1 minute 40 seconds to about 20 seconds. Note, in the future, the capabilities setting will be tidied up and this override won't be used anymore. However this fix is needed for stable. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: tegra: properly disable card clock	Lucas Stach
	commit 3491b69045b1926a198ba70dc1296ca253f2fbdd upstream. The new code to do the clock rate setting externally to the SDMMC module has a shortcut to not propagate changes with a 0 rate to the CAR by simply bailing out. This breaks proper cutting of the card clock. Fix it by directly calling the correct sdhci function. Fixes: a8e326a911d3 "mmc: tegra: implement module external clock change" Signed-off-by: Lucas Stach <dev@lynxeye.de> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: tegra: Disable UHS-I modes for tegra114	Jon Hunter
	commit 7bf037d6ac4768e228e337afd7b6c6d98f947f9f upstream. SD card support for Tegra114 started failing after commit a8e326a911d3 ("mmc: tegra: implement module external clock change") was merged. This commit was part of a series to enable UHS-I modes for Tegra. To workaround this problem for now, disable UHS-I modes for Tegra114 by separating the soc data structures for Tegra114 and Tegra124 so that UHS-I is still enabled for Tegra124 but not Tegra114. Fixes: a8e326a911d3 ("mmc: tegra: implement module external clock change") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Lucas Stach <dev@lynxeye.de> Acked-by: Thierry Reding <treding@nvidia.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: sdhci-pxav3: fix higher speed mode capabilities	Russell King
	commit 0ca33b4ad9cfc133bb3d93eec1ad0eea83d6f252 upstream. Commit 1140011ee9d9 ("mmc: sdhci-pxav3: Modify clock settings for the SDR50 and DDR50 modes") broke any chance of the SDR50 or DDR50 modes being used. The commit claims that SDR50 and DDR50 require clock adjustments in the SDIO3 Configuration register, which is located via the "conf-sdio3" resource. However, when this resource is given, we fail to read the host capabilities 1 register, resulting in host->caps1 being zero. Hence, both SDHCI_SUPPORT_SDR50 and SDHCI_SUPPORT_DDR50 bits remain zero, disabling the SDR50 and DDR50 modes. The underlying idea in this function appears to be to read the device capabilities, modify them, and set SDHCI_QUIRK_MISSING_CAPS to cause our modified capabilities to be used. Implement exactly that. Fixes: 1140011ee9d9 ("mmc: sdhci-pxav3: Modify clock settings for the SDR50 and DDR50 modes") Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: sdhci: fix data timeout (part 2)	Russell King
	commit 7f05538af71c7d30b5fc821cbe9f318edc645961 upstream. The calculation for the timeout based on the number of card clocks is incorrect. The calculation assumed: timeout in microseconds = clock cycles / clock in Hz which is clearly a several orders of magnitude wrong. Fix this by multiplying the clock cycles by 1000000 prior to dividing by the Hz based clock. Also, as per part 1, ensure that the division rounds up. As this needs 64-bit math via do_div(), avoid it if the clock cycles is zero. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: sdhci: fix data timeout (part 1)	Russell King
	commit fafcfda9e78cae8796d1799f14e6457790797555 upstream. The data timeout gives the minimum amount of time that should be waited before timing out if no data is received from the card. Simply dividing the nanosecond part by 1000 does not give this required guarantee, since such a division rounds down. Use DIV_ROUND_UP() to give the desired timeout. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: sdhci: plug DMA mapping leak on error	Russell King
	commit 054cedff5e025a54ceefff891c6ea42ee8b37eab upstream. If we terminate a command early, we fail to properly clean up the DMA mappings for the data part of the request. Put this clean up to the tasklet, which is the common path for finishing a request so we always clean up after ourselves. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> [ Split original patch so that it now contains only the fix ] Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: sdhci: avoid unnecessary mapping/unmapping of align buffer	Russell King
	commit edd63fcc97cdb53279a7c43fa1691f5913d92793 upstream. Unnecessarily mapping and unmapping the align buffer for SD cards is expensive: performance measurements on iMX6 show that this gives a hit of 10% on hdparm buffered disk reads. MMC/SD card IO comes from the mm/vfs which gives us page based IO, so for this case, the align buffer is not going to be used. However, we still map and unmap this buffer. Eliminate this by switching the align buffer to be a DMA coherent buffer, which needs no DMA maintenance to access the buffer. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: sdhci: further fix for DMA unmapping in sdhci_post_req()	Russell King
	commit 771a3dc225815b7cc691c1ce703a3af8488e48df upstream. sdhci_post_req() exists to unmap a previously mapped but already finished request, while the next request is in progress. However, the state of the SDHCI_REQ_USE_DMA flag depends on the last submitted request. This means we can end up clearing the flag due to a quirk, which then means that sdhci_post_req() fails to unmap the DMA buffer, potentially leading to data corruption. We can safely ignore the SDHCI_REQ_USE_DMA here, as testing data->host_cookie is entirely sufficient. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> [ Re-based to apply as a separate fix ] Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: sdhci: fix command response CRC error handling	Russell King
	commit 71fcbda0fcddd0896c4982a484f6c8aa802d28b1 upstream. When we get a response CRC error on a command, it means that the response we received back from the card was not correct. It does not mean that the card did not receive the command correctly. If the command is one which initiates a data transfer, the card can enter the data transfer state, and start sending data. Moreover, if the request contained a data phase, we do not clean this up, and this results in the driver triggering DMA API debug warnings, and also creates a race condition in the driver, between running the finish_tasklet and the data transfer interrupts, which can trigger a "Got data interrupt" state dump. Fix this by handing a response CRC error slightly differently: record the failure of the data initiating command, but allow the remainder of the request to be processed normally. This is safe as core MMC checks the status of all commands and data transfer phases of the request. If the card does not initiate a data transfer, then we should time out according to the data transfer parameters. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> [ Fix missing parenthesis around bitwise-AND expression, and tweak subject ] Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: sdhci: clean up command error handling	Russell King
	commit ec014cbacf6229c583cb832726ca39be1ae3d8c3 upstream. Avoid multiple tests while handling a command error; simplify the code. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> [ Goes with "mmc: sdhci: fix command response CRC error handling" ] Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: sdhci: move initialisation of command error member	Russell King
	commit 96776200898cf9c1965b9f8b9a128e94bb6dce18 upstream. When a command is started, logically it has no error. Initialise the command's error member to zero whenever we start a command. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> [ Goes with "mmc: sdhci: fix command response CRC error handling" ] Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12	mmc: mmc_spi: Add Card Detect comments and fix CD GPIO case	Magnus Damm
	commit bcdc9f260bdce09913db1464be9817170d51044a upstream. This patch fixes the MMC SPI driver from doing polling card detect when a CD GPIO that supports interrupts is specified using the gpios DT property. Without this patch the DT node below results in the following output: spi_gpio: spi-gpio { /* SD2 @ CN12 / compatible = "spi-gpio"; #address-cells = <1>; #size-cells = <0>; gpio-sck = <&gpio6 16 GPIO_ACTIVE_HIGH>; gpio-mosi = <&gpio6 17 GPIO_ACTIVE_HIGH>; gpio-miso = <&gpio6 18 GPIO_ACTIVE_HIGH>; num-chipselects = <1>; cs-gpios = <&gpio6 21 GPIO_ACTIVE_LOW>; status = "okay"; spi@0 { compatible = "mmc-spi-slot"; reg = <0>; voltage-ranges = <3200 3400>; spi-max-frequency = <25000000>; gpios = <&gpio6 22 GPIO_ACTIVE_LOW>; / CD */ }; }; # dmesg \| grep mmc mmc_spi spi32766.0: SD/MMC host mmc0, no WP, no poweroff, cd polling mmc0: host does not support reading read-only switch, assuming write-enable mmc0: new SDHC card on SPI mmcblk0: mmc0:0000 SU04G 3.69 GiB mmcblk0: p1 With this patch applied the "cd polling" portion above disappears. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>