Age | Commit message (Collapse) | Author |
|
When emulating a nested VM-entry from L1 to L2, several control field
validation checks are deferred to the hardware. Should one of these
validation checks fail, vcpu_vmx_run will set the vmx->fail flag. When
this happens, the L2 guest state is not loaded (even in part), and
execution should continue in L1 with the next instruction after the
VMLAUNCH/VMRESUME.
The VMCS12 is not modified (except for the VM-instruction error
field), the VMCS12 MSR save/load lists are not processed, and the CPU
state is not loaded from the VMCS12 host area. Moreover, the vmcs02
exit reason is stale, so it should not be consulted for any reason.
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
On an early VMLAUNCH/VMRESUME failure (i.e. one which sets the
VM-instruction error field of the current VMCS), the launch state of
the current VMCS is not set to "launched," and the VM-exit information
fields of the current VMCS (including IDT-vectoring information and
exit reason) are stale.
On a late VMLAUNCH/VMRESUME failure (i.e. one which sets the high bit
of the exit reason field), the launch state of the current VMCS is not
set to "launched," and only two of the VM-exit information fields of
the current VMCS are modified (exit reason and exit
qualification). The remaining VM-exit information fields of the
current VMCS (including IDT-vectoring information, in particular) are
stale.
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
After a successful VM-entry, RFLAGS is cleared, with the exception of
bit 1, which is always set. This is handled by load_vmcs12_host_state.
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
For example, the following could occur, making us miss a wakeup:
CPU0 CPU1
kvm_vcpu_block kvm_mips_comparecount_func
[L] swait_active(&vcpu->wq)
[S] prepare_to_swait(&vcpu->wq)
[L] if (!kvm_vcpu_has_pending_timer(vcpu))
schedule() [S] queue_timer_int(vcpu)
Ensure that the swait_active() check is not hoisted over the interrupt.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Particularly because kvmppc_fast_vcpu_kick_hv() is a callback,
ensure that we properly serialize wq active checks in order to
avoid potentially missing a wakeup due to racing with the waiter
side.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
During code inspection, the following potential race was seen:
CPU0 CPU1
kvm_async_pf_task_wait apf_task_wake_one
[L] swait_active(&n->wq)
[S] prepare_to_swait(&n.wq)
[L] if (!hlist_unhahed(&n.link))
schedule() [S] hlist_del_init(&n->link);
Properly serialize swait_active() checks such that a wakeup is
not missed.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
A comment might serve future readers.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The value of the guest_irq argument to vmx_update_pi_irte() is
ultimately coming from a KVM_IRQFD API call. Do not BUG() in
vmx_update_pi_irte() if the value is out-of bounds. (Especially,
since KVM as a whole seems to hang after that.)
Instead, print a message only once if we find that we don't have a
route for a certain IRQ (which can be out-of-bounds or within the
array).
This fixes CVE-2017-1000252.
Fixes: efc644048ecde54 ("KVM: x86: Update IRTE for posted-interrupts")
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
If L1 does not specify the "use TPR shadow" VM-execution control in
vmcs12, then L0 must specify the "CR8-load exiting" and "CR8-store
exiting" VM-execution controls in vmcs02. Failure to do so will give
the L2 VM unrestricted read/write access to the hardware CR8.
This fixes CVE-2017-12154.
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
exceptions simultaneously
qemu-system-x86-8600 [004] d..1 7205.687530: kvm_entry: vcpu 2
qemu-system-x86-8600 [004] .... 7205.687532: kvm_exit: reason EXCEPTION_NMI rip 0xffffffffa921297d info ffffeb2c0e44e018 80000b0e
qemu-system-x86-8600 [004] .... 7205.687532: kvm_page_fault: address ffffeb2c0e44e018 error_code 0
qemu-system-x86-8600 [004] .... 7205.687620: kvm_try_async_get_page: gva = 0xffffeb2c0e44e018, gfn = 0x427e4e
qemu-system-x86-8600 [004] .N.. 7205.687628: kvm_async_pf_not_present: token 0x8b002 gva 0xffffeb2c0e44e018
kworker/4:2-7814 [004] .... 7205.687655: kvm_async_pf_completed: gva 0xffffeb2c0e44e018 address 0x7fcc30c4e000
qemu-system-x86-8600 [004] .... 7205.687703: kvm_async_pf_ready: token 0x8b002 gva 0xffffeb2c0e44e018
qemu-system-x86-8600 [004] d..1 7205.687711: kvm_entry: vcpu 2
After running some memory intensive workload in guest, I catch the kworker
which completes the GUP too quickly, and queues an "Page Ready" #PF exception
after the "Page not Present" exception before the next vmentry as the above
trace which will result in #DF injected to guest.
This patch fixes it by clearing the queue for "Page not Present" if "Page Ready"
occurs before the next vmentry since the GUP has already got the required page
and shadow page table has already been fixed by "Page Ready" handler.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Fixes: 7c90705bf2a3 ("KVM: Inject asynchronous page fault into a PV guest if page is swapped out.")
[Changed indentation and added clearing of injected. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
Bug fixes for stable.
|
|
Don't block vCPU if there is pending exception.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
SVM AVIC hardware accelerates guest write to APIC_EOI register
(for edge-trigger interrupt), which means it does not trap to KVM.
So, only enable SVM AVIC only in split irqchip mode.
(e.g. launching qemu w/ option '-machine kernel_irqchip=split').
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Fixes: 44a95dae1d22 ("KVM: x86: Detect and Initialize AVIC support")
[Removed pr_debug - Radim.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Modify struct kvm_x86_ops.arch.apicv_active() to take struct kvm_vcpu
pointer as parameter in preparation to subsequent changes.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Preparing the base code for subsequent changes. This does not change
existing logic.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Clang resolves __builtin_constant_p() to false even if the expression is
constant in the end. The only purpose of that expression was to
differentiate a case where the following expression couldn't be checked
at compile-time, so we can just remove the check.
Clang handles the following two correctly. Turn it into BUG_ON if there
are any more problems with this.
Fixes: d6321d493319 ("KVM: x86: generalize guest_cpuid_has_ helpers")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
When user space sets kvm_run->immediate_exit, KVM is supposed to
return quickly. However, when a vCPU is in KVM_MP_STATE_UNINITIALIZED,
the value is not considered and the vCPU blocks.
Fix that oversight.
Fixes: 460df4c1fc7c008 ("KVM: race-free exit from KVM_RUN without POSIX signals")
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
KVM API says that KVM_RUN will return with -EINTR when a signal is
pending. However, if a vCPU is in KVM_MP_STATE_UNINITIALIZED, then
the return value is unconditionally -EAGAIN.
Copy over some code from vcpu_run(), so that the case of a pending
signal results in the expected return value.
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Fixes: f6511935f424 ("KVM: SVM: Add checks for IO instructions")
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
The commit
9dd21e104bc ('KVM: x86: simplify handling of PKRU')
removed all users and providers of that call-back, but
didn't remove it. Remove it now.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Aneesh Kumar reported seeing host crashes when running recent kernels
on POWER8. The symptom was an oops like this:
Unable to handle kernel paging request for data at address 0xf00000000786c620
Faulting instruction address: 0xc00000000030e1e4
Oops: Kernel access of bad area, sig: 11 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: powernv_op_panel
CPU: 24 PID: 6663 Comm: qemu-system-ppc Tainted: G W 4.13.0-rc7-43932-gfc36c59 #2
task: c000000fdeadfe80 task.stack: c000000fdeb68000
NIP: c00000000030e1e4 LR: c00000000030de6c CTR: c000000000103620
REGS: c000000fdeb6b450 TRAP: 0300 Tainted: G W (4.13.0-rc7-43932-gfc36c59)
MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24044428 XER: 20000000
CFAR: c00000000030e134 DAR: f00000000786c620 DSISR: 40000000 SOFTE: 0
GPR00: 0000000000000000 c000000fdeb6b6d0 c0000000010bd000 000000000000e1b0
GPR04: c00000000115e168 c000001fffa6e4b0 c00000000115d000 c000001e1b180386
GPR08: f000000000000000 c000000f9a8913e0 f00000000786c600 00007fff587d0000
GPR12: c000000fdeb68000 c00000000fb0f000 0000000000000001 00007fff587cffff
GPR16: 0000000000000000 c000000000000000 00000000003fffff c000000fdebfe1f8
GPR20: 0000000000000004 c000000fdeb6b8a8 0000000000000001 0008000000000040
GPR24: 07000000000000c0 00007fff587cffff c000000fdec20bf8 00007fff587d0000
GPR28: c000000fdeca9ac0 00007fff587d0000 00007fff587c0000 00007fff587d0000
NIP [c00000000030e1e4] __get_user_pages_fast+0x434/0x1070
LR [c00000000030de6c] __get_user_pages_fast+0xbc/0x1070
Call Trace:
[c000000fdeb6b6d0] [c00000000139dab8] lock_classes+0x0/0x35fe50 (unreliable)
[c000000fdeb6b7e0] [c00000000030ef38] get_user_pages_fast+0xf8/0x120
[c000000fdeb6b830] [c000000000112318] kvmppc_book3s_hv_page_fault+0x308/0xf30
[c000000fdeb6b960] [c00000000010e10c] kvmppc_vcpu_run_hv+0xfdc/0x1f00
[c000000fdeb6bb20] [c0000000000e915c] kvmppc_vcpu_run+0x2c/0x40
[c000000fdeb6bb40] [c0000000000e5650] kvm_arch_vcpu_ioctl_run+0x110/0x300
[c000000fdeb6bbe0] [c0000000000d6468] kvm_vcpu_ioctl+0x528/0x900
[c000000fdeb6bd40] [c0000000003bc04c] do_vfs_ioctl+0xcc/0x950
[c000000fdeb6bde0] [c0000000003bc930] SyS_ioctl+0x60/0x100
[c000000fdeb6be30] [c00000000000b96c] system_call+0x58/0x6c
Instruction dump:
7ca81a14 2fa50000 41de0010 7cc8182a 68c60002 78c6ffe2 0b060000 3cc2000a
794a3664 390610d8 e9080000 7d485214 <e90a0020> 7d435378 790507e1 408202f0
---[ end trace fad4a342d0414aa2 ]---
It turns out that what has happened is that the SLB entry for the
vmmemap region hasn't been reloaded on exit from a guest, and it has
the wrong page size. Then, when the host next accesses the vmemmap
region, it gets a page fault.
Commit a25bd72badfa ("powerpc/mm/radix: Workaround prefetch issue with
KVM", 2017-07-24) modified the guest exit code so that it now only clears
out the SLB for hash guest. The code tests the radix flag and puts the
result in a non-volatile CR field, CR2, and later branches based on CR2.
Unfortunately, the kvmppc_save_tm function, which gets called between
those two points, modifies all the user-visible registers in the case
where the guest was in transactional or suspended state, except for a
few which it restores (namely r1, r2, r9 and r13). Thus the hash/radix indication in CR2 gets corrupted.
This fixes the problem by re-doing the comparison just before the
result is needed. For good measure, this also adds comments next to
the call sites of kvmppc_save_tm and kvmppc_restore_tm pointing out
that non-volatile register state will be lost.
Cc: stable@vger.kernel.org # v4.13
Fixes: a25bd72badfa ("powerpc/mm/radix: Workaround prefetch issue with KVM")
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
|
Commit 468808bd35c4 ("KVM: PPC: Book3S HV: Set process table for HPT
guests on POWER9", 2017-01-30) added a call to kvmppc_update_lpcr()
which doesn't hold the kvm->lock mutex around the call, as required.
This adds the lock/unlock pair, and for good measure, includes
the kvmppc_setup_partition_table() call in the locked region, since
it is altering global state of the VM.
This error appears not to have any fatal consequences for the host;
the consequences would be that the VCPUs could end up running with
different LPCR values, or an update to the LPCR value by userspace
using the one_reg interface could get overwritten, or the update
done by kvmhv_configure_mmu() could get overwritten.
Cc: stable@vger.kernel.org # v4.10+
Fixes: 468808bd35c4 ("KVM: PPC: Book3S HV: Set process table for HPT guests on POWER9")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
|
The XIVE interrupt controller on POWER9 machines doesn't support byte
accesses to any register in the thread management area other than the
CPPR (current processor priority register). In particular, when
reading the PIPR (pending interrupt priority register), we need to
do a 32-bit or 64-bit load.
Cc: stable@vger.kernel.org # v4.13
Fixes: 2c4fb78f78b6 ("KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
|
Pull KVM updates from Radim Krčmář:
"First batch of KVM changes for 4.14
Common:
- improve heuristic for boosting preempted spinlocks by ignoring
VCPUs in user mode
ARM:
- fix for decoding external abort types from guests
- added support for migrating the active priority of interrupts when
running a GICv2 guest on a GICv3 host
- minor cleanup
PPC:
- expose storage keys to userspace
- merge kvm-ppc-fixes with a fix that missed 4.13 because of
vacations
- fixes
s390:
- merge of kvm/master to avoid conflicts with additional sthyi fixes
- wire up the no-dat enhancements in KVM
- multiple epoch facility (z14 feature)
- Configuration z/Architecture Mode
- more sthyi fixes
- gdb server range checking fix
- small code cleanups
x86:
- emulate Hyper-V TSC frequency MSRs
- add nested INVPCID
- emulate EPTP switching VMFUNC
- support Virtual GIF
- support 5 level page tables
- speedup nested VM exits by packing byte operations
- speedup MMIO by using hardware provided physical address
- a lot of fixes and cleanups, especially nested"
* tag 'kvm-4.14-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits)
KVM: arm/arm64: Support uaccess of GICC_APRn
KVM: arm/arm64: Extract GICv3 max APRn index calculation
KVM: arm/arm64: vITS: Drop its_ite->lpi field
KVM: arm/arm64: vgic: constify seq_operations and file_operations
KVM: arm/arm64: Fix guest external abort matching
KVM: PPC: Book3S HV: Fix memory leak in kvm_vm_ioctl_get_htab_fd
KVM: s390: vsie: cleanup mcck reinjection
KVM: s390: use WARN_ON_ONCE only for checking
KVM: s390: guestdbg: fix range check
KVM: PPC: Book3S HV: Report storage key support to userspace
KVM: PPC: Book3S HV: Fix case where HDEC is treated as 32-bit on POWER9
KVM: PPC: Book3S HV: Fix invalid use of register expression
KVM: PPC: Book3S HV: Fix H_REGISTER_VPA VPA size validation
KVM: PPC: Book3S HV: Fix setting of storage key in H_ENTER
KVM: PPC: e500mc: Fix a NULL dereference
KVM: PPC: e500: Fix some NULL dereferences on error
KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list
KVM: s390: we are always in czam mode
KVM: s390: expose no-DAT to guest and migration support
KVM: s390: sthyi: remove invalid guest write access
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
This fix was intended for 4.13, but didn't get in because both
maintainers were on vacation.
Paul Mackerras:
"It adds mutual exclusion between list_add_rcu and list_del_rcu calls
on the kvm->arch.spapr_tce_tables list. Without this, userspace could
potentially trigger corruption of the list and cause a host crash or
worse."
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull gcc plugins update from Kees Cook:
"This finishes the porting work on randstruct, and introduces a new
option to structleak, both noted below:
- For the randstruct plugin, enable automatic randomization of
structures that are entirely function pointers (along with a couple
designated initializer fixes).
- For the structleak plugin, provide an option to perform zeroing
initialization of all otherwise uninitialized stack variables that
are passed by reference (Ard Biesheuvel)"
* tag 'gcc-plugins-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
gcc-plugins: structleak: add option to init all vars used as byref args
randstruct: Enable function pointer struct detection
drivers/net/wan/z85230.c: Use designated initializers
drm/amd/powerplay: rv: Use designated initializers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull DeviceTree updates from Rob Herring:
"There's a few orphans in the conversion to %pOF printf specifiers
included here that no one else picked up.
Summary:
- Convert more DT code to use of_property_read_* API.
- Improve DT overlay support when adding multiple overlays
- Convert printk's to %pOF format specifiers. Most went via subsystem
trees, but picked up the remaining orphans
- Correct unittests to use preferred "okay" for "status" property
value
- Add a KASLR seed property
- Vendor prefixes for Mellanox, Theobroma System, Adaptrum, Moxa
- Fix modalias buffer handling
- Clean-up of include paths for building dtbs
- Add bindings for amc6821, isl1208, tsl2x7x, srf02, and srf10
devices
- Add nvmem bindings for MediaTek MT7623 and MT7622 SoC
- Add compatible string for Allwinner H5 Mali-450 GPU
- Fix links to old OpenFirmware docs with new mirror on
devicetree.org
- Remove status property from binding doc examples"
* tag 'devicetree-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (45 commits)
devicetree: Adjust status "ok" -> "okay" under drivers/of/
dt-bindings: Remove "status" from examples
dt-bindings: pinctrl: sh-pfc: Use generic node name
dt-bindings: Add vendor Mellanox
dt-binding: net/phy: fix interrupts description
virt: Convert to using %pOF instead of full_name
macintosh: Convert to using %pOF instead of full_name
ide: pmac: Convert to using %pOF instead of full_name
microblaze: Convert to using %pOF instead of full_name
dt-bindings: usb: musb: Grammar s/the/to/, s/is/are/
of: Use PLATFORM_DEVID_NONE definition
of/device: Fix of_device_get_modalias() buffer handling
of/device: Prevent buffer overflow in of_device_modalias()
dt-bindings: add amc6821, isl1208 trivial bindings
dt-bindings: add vendor prefix for Theobroma Systems
of: search scripts/dtc/include-prefixes path for both CPP and DTC
of: remove arch/$(SRCARCH)/boot/dts from include search path for CPP
of: remove drivers/of/testcase-data from include search path for CPP
of: return of_get_cpu_node from of_cpu_device_node_get if CPUs are not registered
iio: srf08: add device tree binding for srf02 and srf10
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"New Drivers
- RK805 Power Management IC (PMIC)
- ROHM BD9571MWV-M MFD Power Management IC (PMIC)
- Texas Instruments TPS68470 Power Management IC (PMIC) & LEDs
New Device Support:
- Add support for HiSilicon Hi6421v530 to hi6421-pmic-core
- Add support for X-Powers AXP806 to axp20x
- Add support for X-Powers AXP813 to axp20x
- Add support for Intel Sunrise Point LPSS to intel-lpss-pci
New Functionality:
- Amend API to provide register layout; atmel-smc
Fix-ups:
- DT re-work; omap, nokia
- Header file location change {I2C => MFD}; dm355evm_msp, tps65010
- Fix chip ID formatting issue(s); rk808
- Optionally register touchscreen devices; da9052-core
- Documentation improvements; twl-core
- Constification; rtsx_pcr, ab8500-core, da9055-i2c, da9052-spi
- Drop unnecessary static declaration; max8925-i2c
- Kconfig changes (missing deps and remove module support)
- Slim down oversized licence statement; hi6421-pmic-core
- Use managed resources (devm_*); lp87565
- Supply proper error checking/handling; t7l66xb
Bug Fixes:
- Fix counter duplication issue; da9052-core
- Fix potential NULL deference issue; max8998
- Leave SPI-NOR write-protection bit alone; lpc_ich
- Ensure device is put into reset during suspend; intel-lpss
- Correct register offset variable size; omap-usb-tll"
* tag 'mfd-next-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (61 commits)
mfd: intel_soc_pmic: Differentiate between Bay and Cherry Trail CRC variants
mfd: intel_soc_pmic: Export separate mfd-cell configs for BYT and CHT
dt-bindings: mfd: Add bindings for ZII RAVE devices
mfd: omap-usb-tll: Fix register offsets
mfd: da9052: Constify spi_device_id
mfd: intel-lpss: Put I2C and SPI controllers into reset state on suspend
mfd: da9055: Constify i2c_device_id
mfd: intel-lpss: Add missing PCI ID for Intel Sunrise Point LPSS devices
mfd: t7l66xb: Handle return value of clk_prepare_enable
mfd: Add ROHM BD9571MWV-M PMIC DT bindings
mfd: intel_soc_pmic_chtwc: Turn Kconfig option into a bool
mfd: lp87565: Convert to use devm_mfd_add_devices()
mfd: Add support for TPS68470 device
mfd: lpc_ich: Do not touch SPI-NOR write protection bit on Haswell/Broadwell
mfd: syscon: atmel-smc: Add helper to retrieve register layout
mfd: axp20x: Use correct platform device ID for many PEK
dt-bindings: mfd: axp20x: Introduce bindings for AXP813
mfd: axp20x: Add support for AXP813 PMIC
dt-bindings: mfd: axp20x: Add AXP806 to supported list of chips
mfd: Add ROHM BD9571MWV-M MFD PMIC driver
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- a new GPIO bit-banging driver implementing PS/2 protocol
- a new power key driver for Rockchip RK805 PMIC
- bunch of patches constifying various device ID structures
- Elan I2C touchpad driver now supports devices with 2 buttons
- other assorted fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (76 commits)
Input: byd - make array seq static, reduces object code size
Input: xilinx_ps2 - fix multiline comment style
Input: pxa27x_keypad - handle return value of clk_prepare_enable
Input: tegra-kbc - handle return value of clk_prepare_enable
Input: PS/2 gpio bit banging driver for serio bus
Input: xen-kbdfront - enable auto repeat for xen keyboard frontend driver
Input: ambakmi - constify amba_id
Input: atmel_mxt_ts - add support for reset line
Input: atmel_mxt_ts - use more managed resources
Input: wacom_w8001 - constify serio_device_id
Input: tsc40 - constify serio_device_id
Input: touchwin - constify serio_device_id
Input: touchright - constify serio_device_id
Input: touchit213 - constify serio_device_id
Input: penmount - constify serio_device_id
Input: mtouch - constify serio_device_id
Input: inexio - constify serio_device_id
Input: hampshire - constify serio_device_id
Input: gunze - constify serio_device_id
Input: fujitsu_ts - constify serio_device_id
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
"Brazil's Independence Day pull request :-)
This is one of the biggest media pull requests, with 625 patches
affecting almost all parts of media (RC, DVB, V4L2, CEC, docs).
This contains:
- A lot of new drivers:
* DVB frontends: mxl5xx, stv0910, stv6111;
* camera flash: as3645a led driver;
* HDMI receiver: adv748X;
* camera sensor: Omnivision 6650 5M driver (ov6650);
* HDMI CEC: ao-cec meson driver;
* V4L2: Qualcom camss driver;
* Remote controller: gpio-ir-tx, pwm-ir-tx and zx-irdec drivers.
- The DDbridge DVB driver got a massive update, with makes it in sync
with modern hardware from that vendor;
- There's an important milestone on this series: the DVB
documentation was written in 2003, but only started to be updated
in 2007. It also used to contain several gaps from the time it was
kept out of tree, mentioning error codes and device nodes that
never existed upstream. On this series, it received a massive
update: all non-deprecated digital TV APIs are now in sync with the
current implementation;
- Some DVB APIs that aren't used by any upstream driver got removed;
- Other parts of the media documentation algo got updated, fixing
some bugs on its PDF output and making it compatible with Sphinx
version 1.6.
As the number of hacks required to build PDF output reduced, I hope
we'll have less troubles as newer versions of our documentation
toolchain are released (famous last words);
- As usual, lots of driver cleanups and improvements"
* tag 'media/v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (624 commits)
media: leds: as3645a: add V4L2_FLASH_LED_CLASS dependency
media: get rid of removed DMX_GET_CAPS and DMX_SET_SOURCE leftovers
media: Revert "[media] v4l: async: make v4l2 coexist with devicetree nodes in a dt overlay"
media: staging: atomisp: sh_css_calloc shall return a pointer to the allocated space
media: Revert "[media] lirc_dev: remove superfluous get/put_device() calls"
media: add qcom_camss.rst to v4l-drivers rst file
media: dvb headers: make checkpatch happier
media: dvb uapi: move frontend legacy API to another part of the book
media: pixfmt-srggb12p.rst: better format the table for PDF output
media: docs-rst: media: Don't use \small for V4L2_PIX_FMT_SRGGB10 documentation
media: index.rst: don't write "Contents:" on PDF output
media: pixfmt*.rst: replace a two dots by a comma
media: vidioc-g-fmt.rst: adjust table format
media: vivid.rst: add a blank line to correct ReST format
media: v4l2 uapi book: get rid of driver programming's chapter
media: format.rst: use the right markup for important notes
media: docs-rst: cardlists: change their format to flat-tables
media: em28xx-cardlist.rst: update to reflect last changes
media: v4l2-event.rst: adjust table to fit on PDF output
media: docs: don't show ToC for each part on PDF output
...
|
|
Pull MMC updates from Ulf Hansson:
"MMC core:
- Continue to refactor the mmc block code to prepare for blkmq
- Move mmc block debugfs into block module
- Next step for eMMC CMDQ by adding a new mmc host interface for it
- Move Kconfig option MMC_DEBUG from core to host
- Some additional minor improvements
MMC host:
- Declare structs as const when applicable
- Explicitly request exclusive reset control when applicable
- Improve some error paths and other various cleanups
- sdhci: Preparations to support SDHCI OMAP
- sdhci: Improve some PM related code
- sdhci: Re-factoring and modernizations
- sdhci-xenon: Add runtime PM and system sleep support
- sdhci-xenon: Add support for eMMC HS400 Enhanced Strobe
- sdhci-cadence: Add system sleep support
- sdhci-of-at91: Improve system sleep support
- dw_mmc: Add support for Hisilicon hi3660
- sunxi: Add support for A83T eMMC
- sunxi: Add support for DDR52 mode
- meson-gx: Add support for UHS-I SD-cards
- meson-gx: Cleanups and improvements
- tmio: Fix CMD12 (STOP) handling
- tmio: Cleanups and improvements
- renesas_sdhi: Add r8a7743/5 support
- renesas-sdhi: Add support for R-Car Gen3 SDHI DMAC
- renesas_sdhi: Cleanups and improvements"
* tag 'mmc-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (145 commits)
mmc: renesas_sdhi: Add r8a7743/5 support
mmc: meson-gx: fix __ffsdi2 undefined on arm32
mmc: sdhci-xenon: add runtime pm support and reimplement standby
mmc: core: Move mmc_start_areq() declaration
mmc: mmci: stop building qcom dml as module
mmc: sunxi: Reset the device at probe time
clk: sunxi-ng: Provide a default reset hook
mmc: meson-gx: rework tuning function
mmc: meson-gx: change default tx phase
mmc: meson-gx: implement voltage switch callback
mmc: meson-gx: use CCF to handle the clock phases
mmc: meson-gx: implement card_busy callback
mmc: meson-gx: simplify interrupt handler
mmc: meson-gx: work around clk-stop issue
mmc: meson-gx: fix dual data rate mode frequencies
mmc: meson-gx: rework clock init function
mmc: meson-gx: rework clk_set function
mmc: meson-gx: rework set_ios function
mmc: meson-gx: cfg init overwrite values
mmc: meson-gx: initialize sane clk default before clock register
...
|
|
Pull block layer updates from Jens Axboe:
"This is the first pull request for 4.14, containing most of the code
changes. It's a quiet series this round, which I think we needed after
the churn of the last few series. This contains:
- Fix for a registration race in loop, from Anton Volkov.
- Overflow complaint fix from Arnd for DAC960.
- Series of drbd changes from the usual suspects.
- Conversion of the stec/skd driver to blk-mq. From Bart.
- A few BFQ improvements/fixes from Paolo.
- CFQ improvement from Ritesh, allowing idling for group idle.
- A few fixes found by Dan's smatch, courtesy of Dan.
- A warning fixup for a race between changing the IO scheduler and
device remova. From David Jeffery.
- A few nbd fixes from Josef.
- Support for cgroup info in blktrace, from Shaohua.
- Also from Shaohua, new features in the null_blk driver to allow it
to actually hold data, among other things.
- Various corner cases and error handling fixes from Weiping Zhang.
- Improvements to the IO stats tracking for blk-mq from me. Can
drastically improve performance for fast devices and/or big
machines.
- Series from Christoph removing bi_bdev as being needed for IO
submission, in preparation for nvme multipathing code.
- Series from Bart, including various cleanups and fixes for switch
fall through case complaints"
* 'for-4.14/block' of git://git.kernel.dk/linux-block: (162 commits)
kernfs: checking for IS_ERR() instead of NULL
drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set
drbd: Fix allyesconfig build, fix recent commit
drbd: switch from kmalloc() to kmalloc_array()
drbd: abort drbd_start_resync if there is no connection
drbd: move global variables to drbd namespace and make some static
drbd: rename "usermode_helper" to "drbd_usermode_helper"
drbd: fix race between handshake and admin disconnect/down
drbd: fix potential deadlock when trying to detach during handshake
drbd: A single dot should be put into a sequence.
drbd: fix rmmod cleanup, remove _all_ debugfs entries
drbd: Use setup_timer() instead of init_timer() to simplify the code.
drbd: fix potential get_ldev/put_ldev refcount imbalance during attach
drbd: new disk-option disable-write-same
drbd: Fix resource role for newly created resources in events2
drbd: mark symbols static where possible
drbd: Send P_NEG_ACK upon write error in protocol != C
drbd: add explicit plugging when submitting batches
drbd: change list_for_each_safe to while(list_first_entry_or_null)
drbd: introduce drbd_recv_header_maybe_unplug
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- the new pvcalls backend for routing socket calls from a guest to dom0
- some cleanups of Xen code
- a fix for wrong usage of {get,put}_cpu()
* tag 'for-linus-4.14b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (27 commits)
xen/mmu: set MMU_NORMAL_PT_UPDATE in remap_area_mfn_pte_fn
xen: Don't try to call xen_alloc_p2m_entry() on autotranslating guests
xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init()
xen/pvcalls: use WARN_ON(1) instead of __WARN()
xen: remove not used trace functions
xen: remove unused function xen_set_domain_pte()
xen: remove tests for pvh mode in pure pv paths
xen-platform: constify pci_device_id.
xen: cleanup xen.h
xen: introduce a Kconfig option to enable the pvcalls backend
xen/pvcalls: implement write
xen/pvcalls: implement read
xen/pvcalls: implement the ioworker functions
xen/pvcalls: disconnect and module_exit
xen/pvcalls: implement release command
xen/pvcalls: implement poll command
xen/pvcalls: implement accept command
xen/pvcalls: implement listen command
xen/pvcalls: implement bind command
xen/pvcalls: implement connect command
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Nothing really major this release, despite quite a lot of activity.
Just lots of things all over the place.
Some things of note include:
- Access via perf to a new type of PMU (IMC) on Power9, which can
count both core events as well as nest unit events (Memory
controller etc).
- Optimisations to the radix MMU TLB flushing, mostly to avoid
unnecessary Page Walk Cache (PWC) flushes when the structure of the
tree is not changing.
- Reworks/cleanups of do_page_fault() to modernise it and bring it
closer to other architectures where possible.
- Rework of our page table walking so that THP updates only need to
send IPIs to CPUs where the affected mm has run, rather than all
CPUs.
- The size of our vmalloc area is increased to 56T on 64-bit hash MMU
systems. This avoids problems with the percpu allocator on systems
with very sparse NUMA layouts.
- STRICT_KERNEL_RWX support on PPC32.
- A new sched domain topology for Power9, to capture the fact that
pairs of cores may share an L2 cache.
- Power9 support for VAS, which is a new mechanism for accessing
coprocessors, and initial support for using it with the NX
compression accelerator.
- Major work on the instruction emulation support, adding support for
many new instructions, and reworking it so it can be used to
implement the emulation needed to fixup alignment faults.
- Support for guests under PowerVM to use the Power9 XIVE interrupt
controller.
And probably that many things again that are almost as interesting,
but I had to keep the list short. Plus the usual fixes and cleanups as
always.
Thanks to: Alexey Kardashevskiy, Alistair Popple, Andreas Schwab,
Aneesh Kumar K.V, Anju T Sudhakar, Arvind Yadav, Balbir Singh,
Benjamin Herrenschmidt, Bhumika Goyal, Breno Leitao, Bryant G. Ly,
Christophe Leroy, Cédric Le Goater, Dan Carpenter, Dou Liyang,
Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff Levand, Hannes
Reinecke, Haren Myneni, Ivan Mikhaylov, John Allen, Julia Lawall,
LABBE Corentin, Laurentiu Tudor, Madhavan Srinivasan, Markus Elfring,
Masahiro Yamada, Matt Brown, Michael Neuling, Murilo Opsfelder Araujo,
Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran,
Paul Mackerras, Rashmica Gupta, Rob Herring, Rui Teng, Sam Bobroff,
Santosh Sivaraj, Scott Wood, Shilpasri G Bhat, Sukadev Bhattiprolu,
Suraj Jitindar Singh, Tobin C. Harding, Victor Aoqui"
* tag 'powerpc-4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (321 commits)
powerpc/xive: Fix section __init warning
powerpc: Fix kernel crash in emulation of vector loads and stores
powerpc/xive: improve debugging macros
powerpc/xive: add XIVE Exploitation Mode to CAS
powerpc/xive: introduce H_INT_ESB hcall
powerpc/xive: add the HW IRQ number under xive_irq_data
powerpc/xive: introduce xive_esb_write()
powerpc/xive: rename xive_poke_esb() in xive_esb_read()
powerpc/xive: guest exploitation of the XIVE interrupt controller
powerpc/xive: introduce a common routine xive_queue_page_alloc()
powerpc/sstep: Avoid used uninitialized error
axonram: Return directly after a failed kzalloc() in axon_ram_probe()
axonram: Improve a size determination in axon_ram_probe()
axonram: Delete an error message for a failed memory allocation in axon_ram_probe()
powerpc/powernv/npu: Move tlb flush before launching ATSD
powerpc/macintosh: constify wf_sensor_ops structures
powerpc/iommu: Use permission-specific DEVICE_ATTR variants
powerpc/eeh: Delete an error out of memory message at init time
powerpc/mm: Use seq_putc() in two functions
macintosh: Convert to using %pOF instead of full_name
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
"The main changes in this cycle were:
- Transparently fall back to other poweroff method(s) if EFI poweroff
fails (and returns)
- Use separate PE/COFF section headers for the RX and RW parts of the
ARM stub loader so that the firmware can use strict mapping
permissions
- Add support for requesting the firmware to wipe RAM at warm reboot
- Increase the size of the random seed obtained from UEFI so CRNG
fast init can complete earlier
- Update the EFI framebuffer address if it points to a BAR that gets
moved by the PCI resource allocation code
- Enable "reset attack mitigation" of TPM environments: this is
enabled if the kernel is configured with
CONFIG_RESET_ATTACK_MITIGATION=y.
- Clang related fixes
- Misc cleanups, constification, refactoring, etc"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/bgrt: Use efi_mem_type()
efi: Move efi_mem_type() to common code
efi/reboot: Make function pointer orig_pm_power_off static
efi/random: Increase size of firmware supplied randomness
efi/libstub: Enable reset attack mitigation
firmware/efi/esrt: Constify attribute_group structures
firmware/efi: Constify attribute_group structures
firmware/dcdbas: Constify attribute_group structures
arm/efi: Split zImage code and data into separate PE/COFF sections
arm/efi: Replace open coded constants with symbolic ones
arm/efi: Remove pointless dummy .reloc section
arm/efi: Remove forbidden values from the PE/COFF header
drivers/fbdev/efifb: Allow BAR to be moved instead of claiming it
efi/reboot: Fall back to original power-off method if EFI_RESET_SHUTDOWN returns
efi/arm/arm64: Add missing assignment of efi.config_table
efi/libstub/arm64: Set -fpie when building the EFI stub
efi/libstub/arm64: Force 'hidden' visibility for section markers
efi/libstub/arm64: Use hidden attribute for struct screen_info reference
efi/arm: Don't mark ACPI reclaim memory as MEMBLOCK_NOMAP
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
KVM/PPC update for 4.14
There are various minor fixes and cleanups. The only new feature is
that we now export information about storage key support to userspace,
so it can advertise it to the guest.
I have pulled in Michael Ellerman's topic/ppc-kvm branch from the
powerpc tree to get a couple of fixes that touch both KVM PPC code and
other PPC code. That's why there is some arch/powerpc stuff in the
diffstat that isn't arch/powerpc/kvm.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
"The main changes include various Hyper-V optimizations such as faster
hypercalls and faster/better TLB flushes - and there's also some
Intel-MID cleanups"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tracing/hyper-v: Trace hyperv_mmu_flush_tlb_others()
x86/hyper-v: Support extended CPU ranges for TLB flush hypercalls
x86/platform/intel-mid: Make several arrays static, to make code smaller
MAINTAINERS: Add missed file for Hyper-V
x86/hyper-v: Use hypercall for remote TLB flush
hyper-v: Globalize vp_index
x86/hyper-v: Implement rep hypercalls
hyper-v: Use fast hypercall for HVCALL_SIGNAL_EVENT
x86/hyper-v: Introduce fast hypercall implementation
x86/hyper-v: Make hv_do_hypercall() inline
x86/hyper-v: Include hyperv/ only when CONFIG_HYPERV is set
x86/platform/intel-mid: Make 'bt_sfi_data' const
x86/platform/intel-mid: Make IRQ allocation a bit more flexible
x86/platform/intel-mid: Group timers callbacks together
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
KVM/ARM Changes for v4.14
Two minor cleanups and improvements, a fix for decoding external abort
types from guests, and added support for migrating the active priority
of interrupts when running a GICv2 guest on a GICv3 host.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
KVM: s390: Fixes and features for 4.14
- merge of topic branch tlb-flushing from the s390 tree to get the
no-dat base features
- merge of kvm/master to avoid conflicts with additional sthyi fixes
- wire up the no-dat enhancements in KVM
- multiple epoch facility (z14 feature)
- Configuration z/Architecture Mode
- more sthyi fixes
- gdb server range checking fix
- small code cleanups
|
|
Merge updates from Andrew Morton:
- various misc bits
- DAX updates
- OCFS2
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (119 commits)
mm,fork: introduce MADV_WIPEONFORK
x86,mpx: make mpx depend on x86-64 to free up VMA flag
mm: add /proc/pid/smaps_rollup
mm: hugetlb: clear target sub-page last when clearing huge page
mm: oom: let oom_reap_task and exit_mmap run concurrently
swap: choose swap device according to numa node
mm: replace TIF_MEMDIE checks by tsk_is_oom_victim
mm, oom: do not rely on TIF_MEMDIE for memory reserves access
z3fold: use per-cpu unbuddied lists
mm, swap: don't use VMA based swap readahead if HDD is used as swap
mm, swap: add sysfs interface for VMA based swap readahead
mm, swap: VMA based swap readahead
mm, swap: fix swap readahead marking
mm, swap: add swap readahead hit statistics
mm/vmalloc.c: don't reinvent the wheel but use existing llist API
mm/vmstat.c: fix wrong comment
selftests/memfd: add memfd_create hugetlbfs selftest
mm/shmem: add hugetlbfs support to memfd_create()
mm, devm_memremap_pages: use multi-order radix for ZONE_DEVICE lookups
mm/vmalloc.c: halve the number of comparisons performed in pcpu_get_vm_areas()
...
|
|
While debugging a problem, I thought that using
cr4_set_bits_and_update_boot() to restore CR4.PCIDE would be
helpful. It turns out to be counterproductive.
Add a comment documenting how this works.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When Linux brings a CPU down and back up, it switches to init_mm and then
loads swapper_pg_dir into CR3. With PCID enabled, this has the side effect
of masking off the ASID bits in CR3.
This can result in some confusion in the TLB handling code. If we
bring a CPU down and back up with any ASID other than 0, we end up
with the wrong ASID active on the CPU after resume. This could
cause our internal state to become corrupt, although major
corruption is unlikely because init_mm doesn't have any user pages.
More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion
in the next context switch. The result of *that* is a failure to
resume from suspend with probability 1 - 1/6^(cpus-1).
Fix it by reinitializing cpu_tlbstate on resume and CPU bringup.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Jiri Kosina <jikos@kernel.org>
Fixes: 10af6235e0d3 ("x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Introduce MADV_WIPEONFORK semantics, which result in a VMA being empty
in the child process after fork. This differs from MADV_DONTFORK in one
important way.
If a child process accesses memory that was MADV_WIPEONFORK, it will get
zeroes. The address ranges are still valid, they are just empty.
If a child process accesses memory that was MADV_DONTFORK, it will get a
segmentation fault, since those address ranges are no longer valid in
the child after fork.
Since MADV_DONTFORK also seems to be used to allow very large programs
to fork in systems with strict memory overcommit restrictions, changing
the semantics of MADV_DONTFORK might break existing programs.
MADV_WIPEONFORK only works on private, anonymous VMAs.
The use case is libraries that store or cache information, and want to
know that they need to regenerate it in the child process after fork.
Examples of this would be:
- systemd/pulseaudio API checks (fail after fork) (replacing a getpid
check, which is too slow without a PID cache)
- PKCS#11 API reinitialization check (mandated by specification)
- glibc's upcoming PRNG (reseed after fork)
- OpenSSL PRNG (reseed after fork)
The security benefits of a forking server having a re-inialized PRNG in
every child process are pretty obvious. However, due to libraries
having all kinds of internal state, and programs getting compiled with
many different versions of each library, it is unreasonable to expect
calling programs to re-initialize everything manually after fork.
A further complication is the proliferation of clone flags, programs
bypassing glibc's functions to call clone directly, and programs calling
unshare, causing the glibc pthread_atfork hook to not get called.
It would be better to have the kernel take care of this automatically.
The patch also adds MADV_KEEPONFORK, to undo the effects of a prior
MADV_WIPEONFORK.
This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO:
https://man.openbsd.org/minherit.2
[akpm@linux-foundation.org: numerically order arch/parisc/include/uapi/asm/mman.h #defines]
Link: http://lkml.kernel.org/r/20170811212829.29186-3-riel@redhat.com
Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Florian Weimer <fweimer@redhat.com>
Reported-by: Colm MacCártaigh <colm@allcosts.net>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Cc: <linux-api@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "mm,fork,security: introduce MADV_WIPEONFORK", v4.
If a child process accesses memory that was MADV_WIPEONFORK, it will get
zeroes. The address ranges are still valid, they are just empty.
If a child process accesses memory that was MADV_DONTFORK, it will get a
segmentation fault, since those address ranges are no longer valid in
the child after fork.
Since MADV_DONTFORK also seems to be used to allow very large programs
to fork in systems with strict memory overcommit restrictions, changing
the semantics of MADV_DONTFORK might break existing programs.
The use case is libraries that store or cache information, and want to
know that they need to regenerate it in the child process after fork.
Examples of this would be:
- systemd/pulseaudio API checks (fail after fork) (replacing a getpid
check, which is too slow without a PID cache)
- PKCS#11 API reinitialization check (mandated by specification)
- glibc's upcoming PRNG (reseed after fork)
- OpenSSL PRNG (reseed after fork)
The security benefits of a forking server having a re-inialized PRNG in
every child process are pretty obvious. However, due to libraries
having all kinds of internal state, and programs getting compiled with
many different versions of each library, it is unreasonable to expect
calling programs to re-initialize everything manually after fork.
A further complication is the proliferation of clone flags, programs
bypassing glibc's functions to call clone directly, and programs calling
unshare, causing the glibc pthread_atfork hook to not get called.
It would be better to have the kernel take care of this automatically.
The patchset also adds MADV_KEEPONFORK, to undo the effects of a prior
MADV_WIPEONFORK.
This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO:
https://man.openbsd.org/minherit.2
This patch (of 2):
MPX only seems to be available on 64 bit CPUs, starting with Skylake and
Goldmont. Move VM_MPX into the 64 bit only portion of vma->vm_flags, in
order to free up a VMA flag.
Link: http://lkml.kernel.org/r/20170811212829.29186-2-riel@redhat.com
Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Colm MacCártaigh <colm@allcosts.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A non-default huge page size can be encoded in the flags argument of the
mmap system call. The definitions for these encodings are in arch
specific header files. However, all architectures use the same values.
Consolidate all the definitions in the primary user header file
(uapi/linux/mman.h). Include definitions for all known huge page sizes.
Use the generic encoding definitions in hugetlb_encode.h as the basis
for these definitions.
Link: http://lkml.kernel.org/r/1501527386-10736-3-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit a7be6e5a7f8d ("mm: drop useless local parameters of
__register_one_node()") removes the last user of parent_node().
The parent_node() macro in METAG architecture is unnecessary.
Remove it for cleanup.
Link: http://lkml.kernel.org/r/1501076076-1974-4-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"Here is the crypto update for 4.14:
API:
- Defer scompress scratch buffer allocation to first use.
- Add __crypto_xor that takes separte src and dst operands.
- Add ahash multiple registration interface.
- Revamped aead/skcipher algif code to fix async IO properly.
Drivers:
- Add non-SIMD fallback code path on ARM for SVE.
- Add AMD Security Processor framework for ccp.
- Add support for RSA in ccp.
- Add XTS-AES-256 support for CCP version 5.
- Add support for PRNG in sun4i-ss.
- Add support for DPAA2 in caam.
- Add ARTPEC crypto support.
- Add Freescale RNGC hwrng support.
- Add Microchip / Atmel ECC driver.
- Add support for STM32 HASH module"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits)
crypto: af_alg - get_page upon reassignment to TX SGL
crypto: cavium/nitrox - Fix an error handling path in 'nitrox_probe()'
crypto: inside-secure - fix an error handling path in safexcel_probe()
crypto: rockchip - Don't dequeue the request when device is busy
crypto: cavium - add release_firmware to all return case
crypto: sahara - constify platform_device_id
MAINTAINERS: Add ARTPEC crypto maintainer
crypto: axis - add ARTPEC-6/7 crypto accelerator driver
crypto: hash - add crypto_(un)register_ahashes()
dt-bindings: crypto: add ARTPEC crypto
crypto: algif_aead - fix comment regarding memory layout
crypto: ccp - use dma_mapping_error to check map error
lib/mpi: fix build with clang
crypto: sahara - Remove leftover from previous used spinlock
crypto: sahara - Fix dma unmap direction
crypto: af_alg - consolidation of duplicate code
crypto: caam - Remove unused dentry members
crypto: ccp - select CONFIG_CRYPTO_RSA
crypto: ccp - avoid uninitialized variable warning
crypto: serpent - improve __serpent_setkey with UBSAN
...
|
|
Pull networking updates from David Miller:
1) Support ipv6 checksum offload in sunvnet driver, from Shannon
Nelson.
2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric
Dumazet.
3) Allow generic XDP to work on virtual devices, from John Fastabend.
4) Add bpf device maps and XDP_REDIRECT, which can be used to build
arbitrary switching frameworks using XDP. From John Fastabend.
5) Remove UFO offloads from the tree, gave us little other than bugs.
6) Remove the IPSEC flow cache, from Florian Westphal.
7) Support ipv6 route offload in mlxsw driver.
8) Support VF representors in bnxt_en, from Sathya Perla.
9) Add support for forward error correction modes to ethtool, from
Vidya Sagar Ravipati.
10) Add time filter for packet scheduler action dumping, from Jamal Hadi
Salim.
11) Extend the zerocopy sendmsg() used by virtio and tap to regular
sockets via MSG_ZEROCOPY. From Willem de Bruijn.
12) Significantly rework value tracking in the BPF verifier, from Edward
Cree.
13) Add new jump instructions to eBPF, from Daniel Borkmann.
14) Rework rtnetlink plumbing so that operations can be run without
taking the RTNL semaphore. From Florian Westphal.
15) Support XDP in tap driver, from Jason Wang.
16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal.
17) Add Huawei hinic ethernet driver.
18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan
Delalande.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits)
i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
i40e: avoid NVM acquire deadlock during NVM update
drivers: net: xgene: Remove return statement from void function
drivers: net: xgene: Configure tx/rx delay for ACPI
drivers: net: xgene: Read tx/rx delay for ACPI
rocker: fix kcalloc parameter order
rds: Fix non-atomic operation on shared flag variable
net: sched: don't use GFP_KERNEL under spin lock
vhost_net: correctly check tx avail during rx busy polling
net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
rxrpc: Make service connection lookup always check for retry
net: stmmac: Delete dead code for MDIO registration
gianfar: Fix Tx flow control deactivation
cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
cxgb4: Fix pause frame count in t4_get_port_stats
cxgb4: fix memory leak
tun: rename generic_xdp to skb_xdp
tun: reserve extra headroom only when XDP is set
net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
net: dsa: bcm_sf2: Advertise number of egress queues
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull writeback error handling updates from Jeff Layton:
"This pile continues the work from last cycle on better tracking
writeback errors. In v4.13 we added some basic errseq_t infrastructure
and converted a few filesystems to use it.
This set continues refining that infrastructure, adds documentation,
and converts most of the other filesystems to use it. The main
exception at this point is the NFS client"
* tag 'wberr-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
ecryptfs: convert to file_write_and_wait in ->fsync
mm: remove optimizations based on i_size in mapping writeback waits
fs: convert a pile of fsync routines to errseq_t based reporting
gfs2: convert to errseq_t based writeback error reporting for fsync
fs: convert sync_file_range to use errseq_t based error-tracking
mm: add file_fdatawait_range and file_write_and_wait
fuse: convert to errseq_t based error tracking for fsync
mm: consolidate dax / non-dax checks for writeback
Documentation: add some docs for errseq_t
errseq: rename __errseq_set to errseq_set
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These include a usual ACPICA code update (this time to upstream
revision 20170728), a fix for a boot crash on some systems with
Thunderbolt devices connected at boot time, a rework of the handling
of PCI bridges when setting up device wakeup, new support for Apple
device properties, support for DMA configurations reported via ACPI on
ARM64, APEI-related updates, ACPI EC driver updates and assorted minor
modifications in several places.
Specifics:
- Update the ACPICA code in the kernel to upstream revision 20170728
including:
* Alias operator handling update (Bob Moore).
* Deferred resolution of reference package elements (Bob Moore).
* Support for the _DMA method in walk resources (Bob Moore).
* Tables handling update and support for deferred table
verification (Lv Zheng).
* Update of SMMU models for IORT (Robin Murphy).
* Compiler and disassembler updates (Alex James, Erik Schmauss,
Ganapatrao Kulkarni, James Morse).
* Tools updates (Erik Schmauss, Lv Zheng).
* Assorted minor fixes and cleanups (Bob Moore, Kees Cook, Lv
Zheng, Shao Ming).
- Rework the initialization of non-wakeup GPEs with method handlers
in order to address a boot crash on some systems with Thunderbolt
devices connected at boot time where we miss an early hotplug event
due to a delay in GPE enabling (Rafael Wysocki).
- Rework the handling of PCI bridges when setting up ACPI-based
device wakeup in order to avoid disabling wakeup for bridges
prematurely (Rafael Wysocki).
- Consolidate Apple DMI checks throughout the tree, add support for
Apple device properties to the device properties framework and use
these properties for the handling of I2C and SPI devices on Apple
systems (Lukas Wunner).
- Add support for _DMA to the ACPI-based device properties lookup
code and make it possible to use the information from there to
configure DMA regions on ARM64 systems (Lorenzo Pieralisi).
- Fix several issues in the APEI code, add support for exporting the
BERT error region over sysfs and update APEI MAINTAINERS entry with
reviewers information (Borislav Petkov, Dongjiu Geng, Loc Ho, Punit
Agrawal, Tony Luck, Yazen Ghannam).
- Fix a potential initialization ordering issue in the ACPI EC driver
and clean it up somewhat (Lv Zheng).
- Update the ACPI SPCR driver to extend the existing XGENE 8250
workaround in it to a new platform (m400) and to work around an
Xgene UART clock issue (Graeme Gregory).
- Add a new utility function to the ACPI core to support using ACPI
OEM ID / OEM Table ID / Revision for system identification in
blacklisting or similar and switch over the existing code already
using this information to this new interface (Toshi Kani).
- Fix an xpower PMIC issue related to GPADC reads that always return
0 without extra pin manipulations (Hans de Goede).
- Add statements to print debug messages in a couple of places in the
ACPI core for easier diagnostics (Rafael Wysocki).
- Clean up the ACPI processor driver slightly (Colin Ian King, Hanjun
Guo).
- Clean up the ACPI x86 boot code somewhat (Andy Shevchenko).
- Add a quirk for Dell OptiPlex 9020M to the ACPI backlight driver
(Alex Hung).
- Assorted fixes, cleanups and updates related to ACPI (Amitoj Kaur
Chawla, Bhumika Goyal, Frank Rowand, Jean Delvare, Punit Agrawal,
Ronald Tschalär, Sumeet Pawnikar)"
* tag 'acpi-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (75 commits)
ACPI / APEI: Suppress message if HEST not present
intel_pstate: convert to use acpi_match_platform_list()
ACPI / blacklist: add acpi_match_platform_list()
ACPI, APEI, EINJ: Subtract any matching Register Region from Trigger resources
ACPI: make device_attribute const
ACPI / sysfs: Extend ACPI sysfs to provide access to boot error region
ACPI: APEI: fix the wrong iteration of generic error status block
ACPI / processor: make function acpi_processor_check_duplicates() static
ACPI / EC: Clean up EC GPE mask flag
ACPI: EC: Fix possible issues related to EC initialization order
ACPI / PM: Add debug statements to acpi_pm_notify_handler()
ACPI: Add debug statements to acpi_global_event_handler()
ACPI / scan: Enable GPEs before scanning the namespace
ACPICA: Make it possible to enable runtime GPEs earlier
ACPICA: Dispatch active GPEs at init time
ACPI: SPCR: work around clock issue on xgene UART
ACPI: SPCR: extend XGENE 8250 workaround to m400
ACPI / LPSS: Don't abort ACPI scan on missing mem resource
mailbox: pcc: Drop uninformative output during boot
ACPI/IORT: Add IORT named component memory address limits
...
|