summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-02-16KVM: PPC: Account TCE-containing pages in locked_vmAlexey Kardashevskiy
At the moment pages used for TCE tables (in addition to pages addressed by TCEs) are not counted in locked_vm counter so a malicious userspace tool can call ioctl(KVM_CREATE_SPAPR_TCE) as many times as RLIMIT_NOFILE and lock a lot of memory. This adds counting for pages used for TCE tables. This counts the number of pages required for a table plus pages for the kvmppc_spapr_tce_table struct (TCE table descriptor) itself. This changes release_spapr_tce_table() to store @npages on stack to avoid calling kvmppc_stt_npages() in the loop (tiny optimization, probably). This does not change the amount of used memory. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-02-16KVM: PPC: Use RCU for arch.spapr_tce_tablesAlexey Kardashevskiy
At the moment only spapr_tce_tables updates are protected against races but not lookups. This fixes missing protection by using RCU for the list. As lookups also happen in real mode, this uses list_for_each_entry_lockless() (which is expected not to access any vmalloc'd memory). This converts release_spapr_tce_table() to a RCU scheduled handler. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-02-16KVM: PPC: Rework H_PUT_TCE/H_GET_TCE handlersAlexey Kardashevskiy
This reworks the existing H_PUT_TCE/H_GET_TCE handlers to have following patches applied nicer. This moves the ioba boundaries check to a helper and adds a check for least bits which have to be zeros. The patch is pretty mechanical (only check for least ioba bits is added) so no change in behaviour is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-02-16powerpc: Make vmalloc_to_phys() publicAlexey Kardashevskiy
This makes vmalloc_to_phys() public as there will be another user (KVM in-kernel VFIO acceleration) for it soon. As this new user can be compiled as a module, this exports the symbol. As a little optimization, this changes the helper to call vmalloc_to_pfn() instead of vmalloc_to_page() as the size of the struct page may not be power-of-two aligned which will make gcc use multiply instructions instead of shifts. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-02-15vfio: Enable VFIO device for powerpcDavid Gibson
ec53500f "kvm: Add VFIO device" added a special KVM pseudo-device which is used to handle any necessary interactions between KVM and VFIO. Currently that device is built on x86 and ARM, but not powerpc, although powerpc does support both KVM and VFIO. This makes things awkward in userspace Currently qemu prints an alarming error message if you attempt to use VFIO and it can't initialize the KVM VFIO device. We don't want to remove the warning, because lack of the KVM VFIO device could mean coherency problems on x86. On powerpc, however, the error is harmless but looks disturbing, and a test based on host architecture in qemu would be ugly, and break if we do need the KVM VFIO device for something important in future. There's nothing preventing the KVM VFIO device from being built for powerpc, so this patch turns it on. It won't actually do anything, since we don't define any of the arch_*() hooks, but it will make qemu happy and we can extend it in future if we need to. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-02-10KVM: s390: bail out early on fatal signal in dirty loggingChristian Borntraeger
A KVM_GET_DIRTY_LOG ioctl might take a long time. This can result in fatal signals seemingly being ignored. Lets bail out during the dirty bit sync, if a fatal signal is pending. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: do not block CPU on dirty loggingChristian Borntraeger
When doing dirty logging on huge guests (e.g.600GB) we sometimes get rcu stall timeouts with backtraces like [ 2753.194083] ([<0000000000112fb2>] show_trace+0x12a/0x130) [ 2753.194092] [<0000000000113024>] show_stack+0x6c/0xe8 [ 2753.194094] [<00000000001ee6a8>] rcu_pending+0x358/0xa48 [ 2753.194099] [<00000000001f20cc>] rcu_check_callbacks+0x84/0x168 [ 2753.194102] [<0000000000167654>] update_process_times+0x54/0x80 [ 2753.194107] [<00000000001bdb5c>] tick_sched_handle.isra.16+0x4c/0x60 [ 2753.194113] [<00000000001bdbd8>] tick_sched_timer+0x68/0x90 [ 2753.194115] [<0000000000182a88>] __run_hrtimer+0x88/0x1f8 [ 2753.194119] [<00000000001838ba>] hrtimer_interrupt+0x122/0x2b0 [ 2753.194121] [<000000000010d034>] do_extint+0x16c/0x170 [ 2753.194123] [<00000000005e206e>] ext_skip+0x38/0x3e [ 2753.194129] [<000000000012157c>] gmap_test_and_clear_dirty+0xcc/0x118 [ 2753.194134] ([<00000000001214ea>] gmap_test_and_clear_dirty+0x3a/0x118) [ 2753.194137] [<0000000000132da4>] kvm_vm_ioctl_get_dirty_log+0xd4/0x1b0 [ 2753.194143] [<000000000012ac12>] kvm_vm_ioctl+0x21a/0x548 [ 2753.194146] [<00000000002b57f6>] do_vfs_ioctl+0x30e/0x518 [ 2753.194149] [<00000000002b5a9c>] SyS_ioctl+0x9c/0xb0 [ 2753.194151] [<00000000005e1ae6>] sysc_tracego+0x14/0x1a [ 2753.194153] [<000003ffb75f3972>] 0x3ffb75f3972 We should do a cond_resched in here. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: do not take mmap_sem on dirty log queryChristian Borntraeger
Dirty log query can take a long time for huge guests. Holding the mmap_sem for very long times can cause some unwanted latencies. Turns out that we do not need to hold the mmap semaphore. We hold the slots_lock for gfn->hva translation and walk the page tables with that address, so no need to look at the VMAs. KVM also holds a reference to the mm, which should prevent other things going away. During the walk we take the necessary ptl locks. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: usage hint for adapter mappingsCornelia Huck
The interface for adapter mappings was designed with code in mind that maps each address only once; let's document this. Otherwise, duplicate mappings are added to the list, which makes the code ineffective and uses up the limited amount of mapping needlessly. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: add documentation of KVM_S390_VM_CRYPTODavid Hildenbrand
Let's properly document KVM_S390_VM_CRYPTO and its attributes. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: add documentation of KVM_S390_VM_TODDavid Hildenbrand
Let's properly document KVM_S390_VM_TOD and its attributes. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: remove old fragment of vector registersDavid Hildenbrand
Since commit 9977e886cbbc ("s390/kernel: lazy restore fpu registers"), vregs in struct sie_page is unsed. We can safely remove the field and the definition. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: instruction-fetching exceptions on SIE faultsDavid Hildenbrand
On instruction-fetch exceptions, we have to forward the PSW by any valid ilc and correctly use that ilc when injecting the irq. Injection will already take care of rewinding the PSW if we injected a nullifying program irq, so we don't need special handling prior to injection. Until now, autodetection would have guessed an ilc of 0. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: provide prog irq ilc on SIE faultsDavid Hildenbrand
On SIE faults, the ilc cannot be detected automatically, as the icptcode is 0. The ilc indicated in the program irq will always be 0. Therefore we have to manually specify the ilc in order to tell the guest which ilen was used when forwarding the PSW. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: irq delivery should not rely on icptcodeDavid Hildenbrand
Program irq injection during program irq intercepts is the last candidates that injects nullifying irqs and relies on delivery to do the right thing. As we should not rely on the icptcode during any delivery (because that value will not be migrated), let's add a flag, telling prog IRQ delivery to not rewind the PSW in case of nullifying prog IRQs. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: clean up prog irq injection on prog irq icptsDavid Hildenbrand
__extract_prog_irq() is used only once for getting the program check data in one place. Let's combine it with an injection function to avoid a memset and to prevent misuse on injection by simplifying the interface to only have the VCPU as parameter. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: read the correct opcode on SIE faultsDavid Hildenbrand
Let's use our fresh new function read_guest_instr() to access guest storage via the correct addressing schema. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: gaccess: implement instruction fetching modeDavid Hildenbrand
When an instruction is to be fetched, special handling applies to secondary-space mode and access-register mode. The instruction is to be fetched from primary space. We can easily support this by selecting the right asce for translation. Access registers will never be used during translation, so don't include them in the interface. As we only want to read from the current PSW address for now, let's also hide that detail. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: gaccess: introduce access modesDavid Hildenbrand
We will need special handling when fetching instructions, so let's introduce new guest access modes GACC_FETCH and GACC_STORE instead of a write flag. An additional patch will then introduce GACC_IFETCH. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: migration / injection of prog irq ilcDavid Hildenbrand
We have to migrate the program irq ilc and someday we will have to specify the ilc without KVM trying to autodetect the value. Let's reuse one of the spare fields in our program irq that should always be set to 0 by user space. Because we also want to make use of 0 ilcs ("not available"), we need a validity indicator. If no valid ilc is given, we try to autodetect the ilc via the current icptcode and icptstatus + parameter and store the valid ilc in the irq structure. This has a nice effect: QEMU's making use of KVM_S390_IRQ / KVM_S390_SET_IRQ_STATE / KVM_S390_GET_IRQ_STATE for migration will directly migrate the ilc without any changes. Please note that we use bit 0 as validity and bit 1,2 for the ilc, so by applying the ilc mask we directly get the ilen which is usually what we work with. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: PSW forwarding / rewinding / ilc reworkDavid Hildenbrand
We have some confusion about ilc vs. ilen in our current code. So let's correctly use the term ilen when dealing with (ilc << 1). Program irq injection didn't take care of the correct ilc in case of irqs triggered by EXECUTE functions, let's provide one function kvm_s390_get_ilen() to take care of all that. Also, manually specifying in intercept handlers the size of the instruction (and sometimes overwriting that value for EXECUTE internally) doesn't make too much sense. So also provide the functions: - kvm_s390_retry_instr to retry the currently intercepted instruction - kvm_s390_rewind_psw to rewind the PSW without internal overwrites - kvm_s390_forward_psw to forward the PSW Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: sync of fp registers via kvm_runDavid Hildenbrand
As we already store the floating point registers in the vector save area in floating point register format when we don't have MACHINE_HAS_VX, we can directly expose them to user space using a new sync flag. The floating point registers will be valid when KVM_SYNC_FPRS is set. The fpc will also be valid when KVM_SYNC_FPRS is set. Either KVM_SYNC_FPRS or KVM_SYNC_VRS will be enabled, never both. Let's also change two positions where we access vrs, making the code easier to read and one comment superfluous. Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: allow sync of fp registers via vregsDavid Hildenbrand
If we have MACHINE_HAS_VX, the floating point registers are stored in the vector register format, event if the guest isn't enabled for vector registers. So we can allow KVM_SYNC_VRS as soon as MACHINE_HAS_VX is available. This can in return be used by user space to support floating point registers via struct kvm_run when the machine has vector registers. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-09KVM: x86: consolidate different ways to test for in-kernel LAPICPaolo Bonzini
Different pieces of code checked for vcpu->arch.apic being (non-)NULL, or used kvm_vcpu_has_lapic (more optimized) or lapic_in_kernel. Replace everything with lapic_in_kernel's name and kvm_vcpu_has_lapic's implementation. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: consolidate "has lapic" checks into irq.cPaolo Bonzini
Do for kvm_cpu_has_pending_timer and kvm_inject_pending_timer_irqs what the other irq.c routines have been doing. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: APIC: remove unnecessary double checks on APIC existencePaolo Bonzini
Usually the in-kernel APIC's existence is checked in the caller. Do not bother checking it again in lapic.c. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM/VMX: Add host irq information in trace event when updating IRTE for ↵Feng Wu
posted interrupts Add host irq information in trace event, so we can better understand which irq is in posted mode. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: Add lowest-priority support for vt-d posted-interruptsFeng Wu
Use vector-hashing to deliver lowest-priority interrupts for VT-d posted-interrupts. This patch extends kvm_intr_is_single_vcpu() to support lowest-priority handling. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: Use vector-hashing to deliver lowest-priority interruptsFeng Wu
Use vector-hashing to deliver lowest-priority interrupts, As an example, modern Intel CPUs in server platform use this method to handle lowest-priority interrupts. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: Recover IRTE to remapped mode if the interrupt is not single-destinationFeng Wu
When the interrupt is not single destination any more, we need to change back IRTE to remapped mode explicitly. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: introduce do_shl32_div32Paolo Bonzini
This is similar to the existing div_frac function, but it returns the remainder too. Unlike div_frac, it can be used to implement long division, e.g. (a << 64) / b for 32-bit a and b. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-08Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "KVM-ARM fixes, mostly coming from the PMU work" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: arm64: KVM: Fix guest dead loop when register accessor returns false arm64: KVM: Fix comments of the CP handler arm64: KVM: Fix wrong use of the CPSR MODE mask for 32bit guests arm64: KVM: Obey RES0/1 reserved bits when setting CPTR_EL2 arm64: KVM: Fix AArch64 guest userspace exception injection
2016-02-08Merge tag 'regmap-fix-v4.5-big-endian' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fix from Mark Brown: "A single revert back to v4.4 endianness handling. Commit 29bb45f25ff3 ("regmap-mmio: Use native endianness for read/write") attempted to fix some long standing bugs in the MMIO implementation for big endian systems caused by duplicate byte swapping in both regmap and readl()/writel(). Sadly the fix makes things worse rather than better, so revert it for now" * tag 'regmap-fix-v4.5-big-endian' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: mmio: Revert to v4.4 endianness handling
2016-02-08scatterlist: fix a typo in comment block of sg_miter_stop()Masahiro Yamada
Fix the doubled "started" and tidy up the following sentences. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-08Merge tag 'kvm-arm-for-4.5-rc2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/ARM fixes for v4.5-rc2 A few random fixes, mostly coming from the PMU work by Shannon: - fix for injecting faults coming from the guest's userspace - cleanup for our CPTR_EL2 accessors (reserved bits) - fix for a bug impacting perf (user/kernel discrimination) - fix for a 32bit sysreg handling bug
2016-02-07Linux 4.5-rc3v4.5-rc3Linus Torvalds
2016-02-07Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "The first real batch of fixes for this release cycle, so there are a few more than usual. Most of these are fixes and tweaks to board support (DT bugfixes, etc). I've also picked up a couple of small cleanups that seemed innocent enough that there was little reason to wait (const/ __initconst and Kconfig deps). Quite a bit of the changes on OMAP were due to fixes to no longer write to rodata from assembly when ARM_KERNMEM_PERMS was enabled, but there were also other fixes. Kirkwood had a bunch of gpio fixes for some boards. OMAP had RTC fixes on OMAP5, and Nomadik had changes to MMC parameters in DT. All in all, mostly the usual mix of various fixes" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits) ARM: multi_v7_defconfig: enable DW_WATCHDOG ARM: nomadik: fix up SD/MMC DT settings ARM64: tegra: Add chosen node for tegra132 norrin ARM: realview: use "depends on" instead of "if" after prompt ARM: tango: use "depends on" instead of "if" after prompt ARM: tango: use const and __initconst for smp_operations ARM: realview: use const and __initconst for smp_operations bus: uniphier-system-bus: revive tristate prompt arm64: dts: Add missing DMA Abort interrupt to Juno bus: vexpress-config: Add missing of_node_put ARM: dts: am57xx: sbc-am57x: correct Eth PHY settings ARM: dts: am57xx: cl-som-am57x: fix CPSW EMAC pinmux ARM: dts: am57xx: sbc-am57x: fix UART3 pinmux ARM: dts: am57xx: cl-som-am57x: update SPI Flash frequency ARM: dts: am57xx: cl-som-am57x: set HOST mode for USB2 ARM: dts: am57xx: sbc-am57x: fix SB-SOM EEPROM I2C address ARM: dts: LogicPD Torpedo: Revert Duplicative Entries ARM: dts: am437x: pixcir_tangoc: use correct flags for irq types ARM: dts: am4372: fix irq type for arm twd and global timer ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type ...
2016-02-07Merge branch 'mailbox-devel' of ↵Linus Torvalds
git://git.linaro.org/landing-teams/working/fujitsu/integration Pull mailbox fixes from Jassi Brar: - fix getting element from the pcc-channels array by simply indexing into it - prevent building mailbox-test driver for archs that don't have IOMEM * 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration: mailbox: Fix dependencies for !HAS_IOMEM archs mailbox: pcc: fix channel calculation in get_pcc_channel()
2016-02-06Merge tag 'usb-4.5-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some USB fixes for 4.5-rc3. The usual, xhci fixes for reported issues, combined with some small gadget driver fixes, and a MAINTAINERS file update. All have been in linux-next with no reported issues" * tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: xhci: harden xhci_find_next_ext_cap against device removal xhci: Fix list corruption in urb dequeue at host removal usb: host: xhci-plat: fix NULL pointer in probe for device tree case usb: xhci-mtk: fix AHB bus hang up caused by roothubs polling usb: xhci-mtk: fix bpkts value of LS/HS periodic eps not behind TT usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Broxton-M platforms usb: xhci: set SSIC port unused only if xhci_suspend succeeds usb: xhci: add a quirk bit for ssic port unused usb: xhci: handle both SSIC ports in PME stuck quirk usb: dwc3: gadget: set the OTG flag in dwc3 gadget driver. Revert "xhci: don't finish a TD if we get a short-transfer event mid TD" MAINTAINERS: fix my email address usb: dwc2: Fix probe problem on bcm2835 Revert "usb: dwc2: Move reset into dwc2_get_hwparams()" usb: musb: ux500: Fix NULL pointer dereference at system PM usb: phy: mxs: declare variable with initialized value usb: phy: msm: fix error handling in probe.
2016-02-06Merge tag 'staging-4.5-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO driver fixes from Greg KH: "Here are some IIO and staging driver fixes for 4.5-rc3. All of them, except one, are for IIO drivers, and one is for a speakup driver fix caused by some earlier patches, to resolve a reported build failure" * tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: Staging: speakup: Fix allyesconfig build on mn10300 iio: dht11: Use boottime iio: ade7753: avoid uninitialized data iio: pressure: mpl115: fix temperature offset sign iio: imu: Fix dependencies for !HAS_IOMEM archs staging: iio: Fix dependencies for !HAS_IOMEM archs iio: adc: Fix dependencies for !HAS_IOMEM archs iio: inkern: fix a NULL dereference on error iio:adc:ti_am335x_adc Fix buffered mode by identifying as software buffer. iio: light: acpi-als: Report data as processed iio: dac: mcp4725: set iio name property in sysfs iio: add HAS_IOMEM dependency to VF610_ADC iio: add IIO_TRIGGER dependency to STK8BA50 iio: proximity: lidar: correct return value iio-light: Use a signed return type for ltr501_match_samp_freq()
2016-02-05Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge fixes from Andrew Morton: "22 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits) epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT radix-tree: fix oops after radix_tree_iter_retry MAINTAINERS: trim the file triggers for ABI/API dax: dirty inode only if required thp: make deferred_split_scan() work again mm: replace vma_lock_anon_vma with anon_vma_lock_read/write ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup um: asm/page.h: remove the pte_high member from struct pte_t mm, hugetlb: don't require CMA for runtime gigantic pages mm/hugetlb: fix gigantic page initialization/allocation mm: downgrade VM_BUG in isolate_lru_page() to warning mempolicy: do not try to queue pages from !vma_migratable() mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress vmstat: make vmstat_update deferrable mm, vmstat: make quiet_vmstat lighter mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT memblock: don't mark memblock_phys_mem_size() as __init dump_stack: avoid potential deadlocks mm: validate_mm browse_rb SMP race condition m32r: fix build failure due to SMP and MMU ...
2016-02-05Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "We have a few wire protocol compatibility fixes, ports of a few recent CRUSH mapping changes, and a couple error path fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: MOSDOpReply v7 encoding libceph: advertise support for TUNABLES5 crush: decode and initialize chooseleaf_stable crush: add chooseleaf_stable tunable crush: ensure take bucket value is valid crush: ensure bucket id is valid before indexing buckets array ceph: fix snap context leak in error path ceph: checking for IS_ERR instead of NULL
2016-02-05Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "Fixes all over the place: - amdkfd: two static checker fixes - mst: a bunch of static checker and spec/hw interaction fixes - amdgpu: fix Iceland hw properly, and some fiji bugs, along with some write-combining fixes. - exynos: some regression fixes - adv7511: fix some EDID reading issues" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (38 commits) drm/dp/mst: deallocate payload on port destruction drm/dp/mst: Reverse order of MST enable and clearing VC payload table. drm/dp/mst: move GUID storage from mgr, port to only mst branch drm/dp/mst: change MST detection scheme drm/dp/mst: Calculate MST PBN with 31.32 fixed point drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil drm/mst: Add range check for max_payloads during init drm/mst: Don't ignore the MST PBN self-test result drm: fix missing reference counting decrease drm/amdgpu: disable uvd and vce clockgating on Fiji drm/amdgpu: remove exp hardware support from iceland drm/amdgpu: load MEC ucode manually on iceland drm/amdgpu: don't load MEC2 on topaz drm/amdgpu: drop topaz support from gmc8 module drm/amdgpu: pull topaz gmc bits into gmc_v7 drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above drm/amdgpu: iceland use CI based MC IP drm/amdgpu: move gmc7 support out of CIK dependency drm/amdgpu/gfx7: enable cp inst/reg error interrupts drm/amdgpu/gfx8: enable cp inst/reg error interrupts ...
2016-02-05Merge tag 'pm+acpi-4.5-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael Wysocki: "These are: a fix for a recently introduced false-positive warnings about PM domain pointers being changed inappropriately (harmless but annoying), an MCH size workaround quirk for one more platform, a compiler warning fix (generic power domains framework), an ACPI LPSS (Intel SoCs) driver fixup and a cleanup of the ACPI CPPC core code. Specifics: - PM core fix to avoid false-positive warnings generated when the pm_domain field is cleared for a device that appears to be bound to a driver (Rafael Wysocki). - New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer). - Fix for an "unused function" compiler warning in the generic power domains framework (Ulf Hansson). - Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set the PM domain pointer of a device properly in one place that was overlooked by a recent PM core update (Andy Shevchenko). - Removal of a redundant function declaration in the ACPI CPPC core code (Timur Tabi)" * tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: Avoid false-positive warnings in dev_pm_domain_set() PM / Domains: Silence compiler warning for an unused function ACPI / CPPC: remove redundant mbox_send_message() declaration ACPI / LPSS: set PM domain via helper setter PNP: Add Haswell-ULT to Intel MCH size workaround
2016-02-05epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUTJason Baron
In the current implementation of the EPOLLEXCLUSIVE flag (added for 4.5-rc1), if epoll waiters create different POLL* sets and register them as exclusive against the same target fd, the current implementation will stop waking any further waiters once it finds the first idle waiter. This means that waiters could miss wakeups in certain cases. For example, when we wake up a pipe for reading we do: wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM); So if one epoll set or epfd is added to pipe p with POLLIN and a second set epfd2 is added to pipe p with POLLRDNORM, only epfd may receive the wakeup since the current implementation will stop after it finds any intersection of events with a waiter that is blocked in epoll_wait(). We could potentially address this by requiring all epoll waiters that are added to p be required to pass the same set of POLL* events. IE the first EPOLL_CTL_ADD that passes EPOLLEXCLUSIVE establishes the set POLL* flags to be used by any other epfds that are added as EPOLLEXCLUSIVE. However, I think it might be somewhat confusing interface as we would have to reference count the number of users for that set, and so userspace would have to keep track of that count, or we would need a more involved interface. It also adds some shared state that we'd have store somewhere. I don't think anybody will want to bloat __wait_queue_head for this. I think what we could do instead, is to simply restrict EPOLLEXCLUSIVE such that it can only be specified with EPOLLIN and/or EPOLLOUT. So that way if the wakeup includes 'POLLIN' and not 'POLLOUT', we can stop once we hit the first idle waiter that specifies the EPOLLIN bit, since any remaining waiters that only have 'POLLOUT' set wouldn't need to be woken. Likewise, we can do the same thing if 'POLLOUT' is in the wakeup bit set and not 'POLLIN'. If both 'POLLOUT' and 'POLLIN' are set in the wake bit set (there is at least one example of this I saw in fs/pipe.c), then we just wake the entire exclusive list. Having both 'POLLOUT' and 'POLLIN' both set should not be on any performance critical path, so I think that's ok (in fs/pipe.c its in pipe_release()). We also continue to include EPOLLERR and EPOLLHUP by default in any exclusive set. Thus, the user can specify EPOLLERR and/or EPOLLHUP but is not required to do so. Since epoll waiters may be interested in other events as well besides EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP, these can still be added by doing a 'dup' call on the target fd and adding that as one normally would with EPOLL_CTL_ADD. Since I think that the POLLIN and POLLOUT events are what we are interest in balancing, I think that the 'dup' thing could perhaps be added to only one of the waiter threads. However, I think that EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP should be sufficient for the majority of use-cases. Since EPOLLEXCLUSIVE is intended to be used with a target fd shared among multiple epfds, where between 1 and n of the epfds may receive an event, it does not satisfy the semantics of EPOLLONESHOT where only 1 epfd would get an event. Thus, it is not allowed to be specified in conjunction with EPOLLEXCLUSIVE. EPOLL_CTL_MOD is also not allowed if the fd was previously added as EPOLLEXCLUSIVE. It seems with the limited number of flags to not be as interesting, but this could be relaxed at some further point. Signed-off-by: Jason Baron <jbaron@akamai.com> Tested-by: Madars Vitolins <m@silodev.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Eric Wong <normalperson@yhbt.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05radix-tree: fix oops after radix_tree_iter_retryKonstantin Khlebnikov
Helper radix_tree_iter_retry() resets next_index to the current index. In following radix_tree_next_slot current chunk size becomes zero. This isn't checked and it tries to dereference null pointer in slot. Tagged iterator is fine because retry happens only at slot 0 where tag bitmask in iter->tags is filled with single bit. Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup") Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ohad Ben-Cohen <ohad@wizery.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05MAINTAINERS: trim the file triggers for ABI/APIMichael Kerrisk (man-pages)
Commit ea8f8fc8631 ("MAINTAINERS: add linux-api for review of API/ABI changes") added file triggers for various paths that likely indicated API/ABI changes. However, catching all changes in Documentation/ABI/ and include/uapi/ produces a large volume of mail to linux-api, rather than only API/ABI changes. Drop those two entries, but leave include/linux/syscalls.h and kernel/sys_ni.c to catch syscall-related changes. [josh@joshtriplett.org: redid changelog] Signed-off-by: Michael Kerrisk <mtk.man-pages@gmail.com> Acked-by: Shuah khan <shuahkh@osg.samsung.com> Cc: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05dax: dirty inode only if requiredDmitry Monakhov
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05thp: make deferred_split_scan() work againKirill A. Shutemov
We need to iterate over split_queue, not local empty list to get anything split from the shrinker. Fixes: e3ae19535c66 ("thp: limit number of object to scan on deferred_split_scan()") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05mm: replace vma_lock_anon_vma with anon_vma_lock_read/writeKonstantin Khlebnikov
Sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if anon_vma appeared between lock and unlock. We have to check anon_vma first or call anon_vma_prepare() to be sure that it's here. There are only few users of these legacy helpers. Let's get rid of them. This patch fixes anon_vma lock imbalance in validate_mm(). Write lock isn't required here, read lock is enough. And reorders expand_downwards/expand_upwards: security_mmap_addr() and wrapping-around check don't have to be under anon vma lock. Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9nFg9_7DhRMLppBo2g@mail.gmail.com Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>