lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2015-07-20	Linux 3.18.19v3.18.19	Sasha Levin
	Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	x86/xen: allow privcmd hypercalls to be preempted	David Vrabel
	[ Upstream commit db0e2aa47f0a45fc56592c25ad370f8836cbb128 ] Hypercalls submitted by user space tools via the privcmd driver can take a long time (potentially many 10s of seconds) if the hypercall has many sub-operations. A fully preemptible kernel may deschedule such as task in any upcall called from a hypercall continuation. However, in a kernel with voluntary or no preemption, hypercall continuations in Xen allow event handlers to be run but the task issuing the hypercall will not be descheduled until the hypercall is complete and the ioctl returns to user space. These long running tasks may also trigger the kernel's soft lockup detection. Add xen_preemptible_hcall_begin() and xen_preemptible_hcall_end() to bracket hypercalls that may be preempted. Use these in the privcmd driver. When returning from an upcall, call xen_maybe_preempt_hcall() which adds a schedule point if if the current task was within a preemptible hypercall. Since _cond_resched() can move the task to a different CPU, clear and set xen_in_preemptible_hcall around the call. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	rtnl: restore notifications for deleted interfaces	Nicolas Dichtel
	The commit 984ff7a3e060 is an upstream backport. In fact, it depends on commit 395eea6ccf2b ("rtnetlink: delay RTM_DELLINK notification until after ndo_uninit()") which has not been backported in 3.18.y. Before commit 395eea6ccf2b, rollback_registered_many() uses rtmsg_ifinfo(). The call to this function is done with dev->reg_state set to NETREG_UNREGISTERING, thus testing this reg_state in rtmsg_ifinfo() is wrong. This patch partially reverts commit 984ff7a3e060. Fixes: 984ff7a3e060 ("rtnl/bond: don't send rtnl msg for unregistered iface") Reported-by: Kristian Evensen <kristian.evensen@gmail.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	Input: synaptics - add min/max quirk for Lenovo S540	Peter Hutterer
	[ Upstream commit 7f2ca8b55aeff1fe51ed3570200ef88a96060917 ] https://bugzilla.redhat.com/show_bug.cgi?id=1223051#c2 Cc: stable@vger.kernel.org Tested-by: tommy.gagnes@gmail.com Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	Revert "Input: synaptics - add min/max quirk for Lenovo S540"	Sasha Levin
	This reverts commit e73268af3e137b1f69d0f82d80aa88e0e8c7cbbe. Seems like an incorrect merge on my end, will pick the upstream version again next. Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	Revert "nfs: take extra reference to fl->fl_file when running a LOCKU operation"	Sasha Levin
	This reverts commit ed7f7f145ec1445a130513db9ad8f1547f77a578. Reverting from stable tree as fix was found to be buggy. New fix pending. Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	fs/ufs: restore s_lock mutex_init()	Fabian Frederick
	[ Upstream commit e4f95517f18271b1da36cfc5d700e46844396d6e ] Add last missing line in commit "cdd9eefdf905" ("fs/ufs: restore s_lock mutex") Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	ufs: Fix possible deadlock when looking up directories	Jan Kara
	[ Upstream commit 514d748f69c97a51a2645eb198ac5c6218f22ff9 ] Commit e4502c63f56aeca88 (ufs: deal with nfsd/iget races) made ufs create inodes with I_NEW flag set. However ufs_mkdir() never cleared this flag. Thus if someone ever tried to lookup the directory by inode number, he would deadlock waiting for I_NEW to be cleared. Luckily this mostly happens only if the filesystem is exported over NFS since otherwise we have the inode attached to dentry and don't look it up by inode number. In rare cases dentry can get freed without inode being freed and then we'd hit the deadlock even without NFS export. Fix the problem by clearing I_NEW before instantiating new directory inode. Fixes: e4502c63f56aeca887ced37f24e0def1ef11cec8 Reported-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	ufs: Fix warning from unlock_new_inode()	Jan Kara
	[ Upstream commit 12ecbb4b1d765a5076920999298d9625439dbe58 ] Commit e4502c63f56aeca88 (ufs: deal with nfsd/iget races) introduced unlock_new_inode() call into ufs_add_nondir(). However that function gets called also from ufs_link() which hands it already initialized inode and thus unlock_new_inode() complains. The problem is harmless but annoying. Fix the problem by opencoding necessary stuff in ufs_link() Fixes: e4502c63f56aeca887ced37f24e0def1ef11cec8 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	vfs: Ignore unlocked mounts in fs_fully_visible	Eric W. Biederman
	[ Upstream commit c89d4319ae55186496c43b7a6e510aa1d09dd387 ] commit ceeb0e5d39fcdf4dca2c997bf225c7fc49200b37 upstream. Limit the mounts fs_fully_visible considers to locked mounts. Unlocked can always be unmounted so considering them adds hassle but no security benefit. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	KVM: x86: properly restore LVT0	Radim Krčmář
	[ Upstream commit 58382447b9a9989da551a7b17e72756f6e238bb8 ] commit db1385624c686fe99fe2d1b61a36e1537b915d08 upstream. Legacy NMI watchdog didn't work after migration/resume, because vapics_in_nmi_mode was left at 0. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	KVM: s390: virtio-ccw: don't overwrite config space values	Cornelia Huck
	[ Upstream commit 431dae778aea4eed31bd12e5ee82edc571cd4d70 ] Eric noticed problems with vhost-scsi and virtio-ccw: vhost-scsi complained about overwriting values in the config space, which was triggered by a broken implementation of virtio-ccw's config get/set routines. It was probably sheer luck that we did not hit this before. When writing a value to the config space, the WRITE_CONF ccw will always write from the beginning of the config space up to and including the value to be set. If the config space up to the value has not yet been retrieved from the device, however, we'll end up overwriting values. Keep track of the known config space and update if needed to avoid this. Moreover, READ_CONF will only read the number of bytes it has been instructed to retrieve, so we must not copy more than that to the buffer, or we might overwrite trailing values. Reported-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com> Tested-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	selinux: fix setting of security labels on NFS	J. Bruce Fields
	[ Upstream commit 9fc2b4b436cff7d8403034676014f1be9d534942 ] Before calling into the filesystem, vfs_setxattr calls security_inode_setxattr, which ends up calling selinux_inode_setxattr in our case. That returns -EOPNOTSUPP whenever SBLABEL_MNT is not set. SBLABEL_MNT was supposed to be set by sb_finish_set_opts, which sets it only if selinux_is_sblabel_mnt returns true. The selinux_is_sblabel_mnt logic was broken by eadcabc697e9 "SELinux: do all flags twiddling in one place", which didn't take into the account the SECURITY_FS_USE_NATIVE behavior that had been introduced for nfs with eb9ae686507b "SELinux: Add new labeling type native labels". This caused setxattr's of security labels over NFSv4.2 to fail. Cc: stable@kernel.org # 3.13 Cc: Eric Paris <eparis@redhat.com> Cc: David Quigley <dpquigl@davequigley.com> Reported-by: Richard Chan <rc556677@outlook.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: added the stable dependency] Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-20	usb: gadget: f_fs: add extra check before unregister_gadget_item	Rui Miguel Silva
	[ Upstream commit f14e9ad17f46051b02bffffac2036486097de19e ] ffs_closed can race with configfs_rmdir which will call config_item_release, so add an extra check to avoid calling the unregister_gadget_item with an null gadget item. Signed-off-by: Rui Miguel Silva <rui.silva@linaro.org> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	perf/x86: Honor the architectural performance monitoring version	Palik, Imre
	[ Upstream commit 2c33645d366d13b969d936b68b9f4875b1fdddea ] Architectural performance monitoring, version 1, doesn't support fixed counters. Currently, even if a hypervisor advertises support for architectural performance monitoring version 1, perf may still try to use the fixed counters, as the constraints are set up based on the CPU model. This patch ensures that perf honors the architectural performance monitoring version returned by CPUID, and it only uses the fixed counters for version 2 and above. (Some of the ideas in this patch came from Peter Zijlstra.) Signed-off-by: Imre Palik <imrep@amazon.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Anthony Liguori <aliguori@amazon.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1433767609-1039-1-git-send-email-imrep.amz@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	perf: Fix ring_buffer_attach() RCU sync, again	Oleg Nesterov
	[ Upstream commit 2f993cf093643b98477c421fa2b9a98dcc940323 ] While looking for other users of get_state/cond_sync. I Found ring_buffer_attach() and it looks obviously buggy? Don't we need to ensure that we have "synchronize" _between_ list_del() and list_add() ? IOW. Suppose that ring_buffer_attach() preempts right_after get_state_synchronize_rcu() and gp completes before spin_lock(). In this case cond_synchronize_rcu() does nothing and we reuse ->rb_entry without waiting for gp in between? It also moves the ->rcu_pending check under "if (rb)", to make it more readable imo. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Cc: der.herr@hofr.at Cc: josh@joshtriplett.org Cc: tj@kernel.org Fixes: b69cf53640da ("perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()") Link: http://lkml.kernel.org/r/20150530200425.GA15748@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	x86/boot: Fix overflow warning with 32-bit binutils	Borislav Petkov
	[ Upstream commit 04c17341b42699a5859a8afa05e64ba08a4e5235 ] When building the kernel with 32-bit binutils built with support only for the i386 target, we get the following warning: arch/x86/kernel/head_32.S:66: Warning: shift count out of range (32 is not between 0 and 31) The problem is that in that case, binutils' internal type representation is 32-bit wide and the shift range overflows. In order to fix this, manipulate the shift expression which creates the 4GiB constant to not overflow the shift count. Suggested-by: Michael Matz <matz@suse.de> Reported-and-tested-by: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	vfs: Remove incorrect debugging WARN in prepend_path	Eric W. Biederman
	[ Upstream commit 93e3bce6287e1fb3e60d3324ed08555b5bbafa89 ] The warning message in prepend_path is unclear and outdated. It was added as a warning that the mechanism for generating names of pseudo files had been removed from prepend_path and d_dname should be used instead. Unfortunately the warning reads like a general warning, making it unclear what to do with it. Remove the warning. The transition it was added to warn about is long over, and I added code several years ago which in rare cases causes the warning to fire on legitimate code, and the warning is now firing and scaring people for no good reason. Cc: stable@vger.kernel.org Reported-by: Ivan Delalande <colona@arista.com> Reported-by: Omar Sandoval <osandov@osandov.com> Fixes: f48cfddc6729e ("vfs: In d_path don't call d_dname on a mount point") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	KVM: x86: make vapics_in_nmi_mode atomic	Radim Krčmář
	[ Upstream commit 42720138b06301cc8a7ee8a495a6d021c4b6a9bc ] Writes were a bit racy, but hard to turn into a bug at the same time. (Particularly because modern Linux doesn't use this feature anymore.) Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> [Actually the next patch makes it much, much easier to trigger the race so I'm including this one for stable@ as well. - Paolo] Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	arm: KVM: force execution of HCPTR access on VM exit	Marc Zyngier
	[ Upstream commit 85e84ba31039595995dae80b277378213602891b ] On VM entry, we disable access to the VFP registers in order to perform a lazy save/restore of these registers. On VM exit, we restore access, test if we did enable them before, and save/restore the guest/host registers if necessary. In this sequence, the FPEXC register is always accessed, irrespective of the trapping configuration. If the guest didn't touch the VFP registers, then the HCPTR access has now enabled such access, but we're missing a barrier to ensure architectural execution of the new HCPTR configuration. If the HCPTR access has been delayed/reordered, the subsequent access to FPEXC will cause a trap, which we aren't prepared to handle at all. The same condition exists when trapping to enable VFP for the guest. The fix is to introduce a barrier after enabling VFP access. In the vmexit case, it can be relaxed to only takes place if the guest hasn't accessed its view of the VFP registers, making the access to FPEXC safe. The set_hcptr macro is modified to deal with both vmenter/vmexit and vmtrap operations, and now takes an optional label that is branched to when the guest hasn't touched the VFP registers. Reported-by: Vikram Sethi <vikrams@codeaurora.org> Cc: stable@kernel.org # v3.9+ Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	iommu/amd: Handle large pages correctly in free_pagetable	Joerg Roedel
	[ Upstream commit 0b3fff54bc01e8e6064d222a33e6fa7adabd94cd ] Make sure that we are skipping over large PTEs while walking the page-table tree. Cc: stable@kernel.org Fixes: 5c34c403b723 ("iommu/amd: Fix memory leak in free_pagetable") Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	arm64/kvm: Fix assembler compatibility of macros	Geoff Levand
	[ Upstream commit 286fb1cc32b11c18da3573a8c8c37a4f9da16e30 ] Some of the macros defined in kvm_arm.h are useful in assembly files, but are not compatible with the assembler. Change any C language integer constant definitions using appended U, UL, or ULL to the UL() preprocessor macro. Also, add a preprocessor include of the asm/memory.h file which defines the UL() macro. Fixes build errors like these when using kvm_arm.h in assembly source files: Error: unexpected characters following instruction at operand 3 -- `and x0,x1,#((1U<<25)-1)' Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	KVM: nSVM: Check for NRIPS support before updating control field	Bandan Das
	[ Upstream commit f104765b4f81fd74d69e0eb161e89096deade2db ] If hardware doesn't support DecodeAssist - a feature that provides more information about the intercept in the VMCB, KVM decodes the instruction and then updates the next_rip vmcb control field. However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS]. Since skip_emulated_instruction() doesn't verify nrip support before accepting control.next_rip as valid, avoid writing this field if support isn't present. Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	ARM: clk-imx6q: refine sata's parent	Sebastien Szymanski
	[ Upstream commit 3a3ac04f0e0ef48c2d30ae6df7b8906421c2b35d ] commit da946aeaeadcd24ff0cda9984c6fb8ed2bfd462a upstream. According to IMX6D/Q RM, table 18-3, sata clock's parent is ahb, not ipg. Signed-off-by: Sebastien Szymanski <sebastien.szymanski@armadeus.com> Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> [dirk.behme: Adjust moved file] Signed-off-by: Dirk Behme <dirk.behme@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	Btrfs: make xattr replace operations atomic	Filipe Manana
	[ Upstream commit 02590fd855d1690568b2fa439c942e933221b57a ] commit 5f5bc6b1e2d5a6f827bc860ef2dc5b6f365d1339 upstream. Replacing a xattr consists of doing a lookup for its existing value, delete the current value from the respective leaf, release the search path and then finally insert the new value. This leaves a time window where readers (getxattr, listxattrs) won't see any value for the xattr. Xattrs are used to store ACLs, so this has security implications. This change also fixes 2 other existing issues which were: ) Deleting the old xattr value without verifying first if the new xattr will fit in the existing leaf item (in case multiple xattrs are packed in the same item due to name hash collision); ) Returning -EEXIST when the flag XATTR_CREATE is given and the xattr doesn't exist but we have have an existing item that packs muliple xattrs with the same name hash as the input xattr. In this case we should return ENOSPC. A test case for xfstests follows soon. Thanks to Alexandre Oliva for reporting the non-atomicity of the xattr replace implementation. Reported-by: Alexandre Oliva <oliva@gnu.org> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	x86/microcode/intel: Guard against stack overflow in the loader	Quentin Casasnovas
	[ Upstream commit f84598bd7c851f8b0bf8cd0d7c3be0d73c432ff4 ] mc_saved_tmp is a static array allocated on the stack, we need to make sure mc_saved_count stays within its bounds, otherwise we're overflowing the stack in _save_mc(). A specially crafted microcode header could lead to a kernel crash or potentially kernel execution. Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1422964824-22056-1-git-send-email-quentin.casasnovas@oracle.com Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	netfilter: nf_tables: allow to change chain policy without hook if it exists	Pablo Neira Ayuso
	[ Upstream commit d6b6cb1d3e6f78d55c2d4043d77d0d8def3f3b99 ] If there's an existing base chain, we have to allow to change the default policy without indicating the hook information. However, if the chain doesn't exists, we have to enforce the presence of the hook attribute. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set	Pablo Neira Ayuso
	[ Upstream commit 749177ccc74f9c6d0f51bd78a15c652a2134aa11 ] ip6tables extensions check for this flag to restrict match/target to a given protocol. Without this flag set, SYNPROXY6 returns an error. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	netfilter: Zero the tuple in nfnl_cthelper_parse_tuple()	Ian Wilson
	[ Upstream commit 78146572b9cd20452da47951812f35b1ad4906be ] nfnl_cthelper_parse_tuple() is called from nfnl_cthelper_new(), nfnl_cthelper_get() and nfnl_cthelper_del(). In each case they pass a pointer to an nf_conntrack_tuple data structure local variable: struct nf_conntrack_tuple tuple; ... ret = nfnl_cthelper_parse_tuple(&tuple, tb[NFCTH_TUPLE]); The problem is that this local variable is not initialized, and nfnl_cthelper_parse_tuple() only initializes two fields: src.l3num and dst.protonum. This leaves all other fields with undefined values based on whatever is on the stack: tuple->src.l3num = ntohs(nla_get_be16(tb[NFCTH_TUPLE_L3PROTONUM])); tuple->dst.protonum = nla_get_u8(tb[NFCTH_TUPLE_L4PROTONUM]); The symptom observed was that when the rpc and tns helpers were added then traffic to port 1536 was being sent to user-space. Signed-off-by: Ian Wilson <iwilson@brocade.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	sb_edac: Fix erroneous bytes->gigabytes conversion	Jim Snow
	[ Upstream commit ece9210859abb2cc0126f8adbfbbdee668d4906a ] commit 8c009100295597f23978c224aec5751a365bc965 upstream. Signed-off-by: Jim Snow <jim.snow@intel.com> Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> [ vlee: Backported to 3.14. Adjusted context. ] Signed-off-by: Vinson Lee <vlee@twitter.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	tracing: Have filter check for balanced ops	Steven Rostedt
	[ Upstream commit 2cf30dc180cea808077f003c5116388183e54f9e ] When the following filter is used it causes a warning to trigger: # cd /sys/kernel/debug/tracing # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter -bash: echo: write error: Invalid argument # cat events/ext4/ext4_truncate_exit/filter ((dev==1)blocks==2) ^ parse_error: No error ------------[ cut here ]------------ WARNING: CPU: 2 PID: 1223 at kernel/trace/trace_events_filter.c:1640 replace_preds+0x3c5/0x990() Modules linked in: bnep lockd grace bluetooth ... CPU: 3 PID: 1223 Comm: bash Tainted: G W 4.1.0-rc3-test+ #450 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012 0000000000000668 ffff8800c106bc98 ffffffff816ed4f9 ffff88011ead0cf0 0000000000000000 ffff8800c106bcd8 ffffffff8107fb07 ffffffff8136b46c ffff8800c7d81d48 ffff8800d4c2bc00 ffff8800d4d4f920 00000000ffffffea Call Trace: [<ffffffff816ed4f9>] dump_stack+0x4c/0x6e [<ffffffff8107fb07>] warn_slowpath_common+0x97/0xe0 [<ffffffff8136b46c>] ? _kstrtoull+0x2c/0x80 [<ffffffff8107fb6a>] warn_slowpath_null+0x1a/0x20 [<ffffffff81159065>] replace_preds+0x3c5/0x990 [<ffffffff811596b2>] create_filter+0x82/0xb0 [<ffffffff81159944>] apply_event_filter+0xd4/0x180 [<ffffffff81152bbf>] event_filter_write+0x8f/0x120 [<ffffffff811db2a8>] __vfs_write+0x28/0xe0 [<ffffffff811dda43>] ? __sb_start_write+0x53/0xf0 [<ffffffff812e51e0>] ? security_file_permission+0x30/0xc0 [<ffffffff811dc408>] vfs_write+0xb8/0x1b0 [<ffffffff811dc72f>] SyS_write+0x4f/0xb0 [<ffffffff816f5217>] system_call_fastpath+0x12/0x6a ---[ end trace e11028bd95818dcd ]--- Worse yet, reading the error message (the filter again) it says that there was no error, when there clearly was. The issue is that the code that checks the input does not check for balanced ops. That is, having an op between a closed parenthesis and the next token. This would only cause a warning, and fail out before doing any real harm, but it should still not caues a warning, and the error reported should work: # cd /sys/kernel/debug/tracing # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter -bash: echo: write error: Invalid argument # cat events/ext4/ext4_truncate_exit/filter ((dev==1)blocks==2) ^ parse_error: Meaningless filter expression And give no kernel warning. Link: http://lkml.kernel.org/r/20150615175025.7e809215@gandalf.local.home Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: stable@vger.kernel.org # 2.6.31+ Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-13	fs: Fix S_NOSEC handling	Jan Kara
	[ Upstream commit 2426f3910069ed47c0cc58559a6d088af7920201 ] file_remove_suid() could mistakenly set S_NOSEC inode bit when root was modifying the file. As a result following writes to the file by ordinary user would avoid clearing suid or sgid bits. Fix the bug by checking actual mode bits before setting S_NOSEC. CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-12	can: fix loss of CAN frames in raw_rcv	Oliver Hartkopp
	[ Upstream commit 36c01245eb8046c16eee6431e7dbfbb302635fa8 ] As reported by Manfred Schlaegl here http://marc.info/?l=linux-netdev&m=143482089824232&w=2 commit 514ac99c64b "can: fix multiple delivery of a single CAN frame for overlapping CAN filters" requires the skb->tstamp to be set to check for identical CAN skbs. As net timestamping is influenced by several players (netstamp_needed and netdev_tstamp_prequeue) Manfred missed a proper timestamp which leads to CAN frame loss. As skb timestamping became now mandatory for CAN related skbs this patch makes sure that received CAN skbs always have a proper timestamp set. Maybe there's a better solution in the future but this patch fixes the CAN frame loss so far. Reported-by: Manfred Schlaegl <manfred.schlaegl@gmx.at> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-12	fs/ufs: restore s_lock mutex	Fabian Frederick
	[ Upstream commit cdd9eefdf905e92e7fc6cc393314efe68dc6ff66 ] Commit 0244756edc4b98c ("ufs: sb mutex merge + mutex_destroy") generated deadlocks in read/write mode on mkdir. This patch partially reverts it keeping fixes by Andrew Morton and mutex_destroy() [AV: fixed a missing bit in ufs_remount()] Signed-off-by: Fabian Frederick <fabf@skynet.be> Reported-by: Ian Campbell <ian.campbell@citrix.com> Suggested-by: Jan Kara <jack@suse.cz> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Roger Pau Monne <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-12	fs/ufs: revert "ufs: fix deadlocks introduced by sb mutex merge"	Fabian Frederick
	[ Upstream commit 13b987ea275840d74d9df9a44326632fab1894da ] This reverts commit 9ef7db7f38d0 ("ufs: fix deadlocks introduced by sb mutex merge") That patch tried to solve commit 0244756edc4b98c ("ufs: sb mutex merge + mutex_destroy") which is itself partially reverted due to multiple deadlocks. Signed-off-by: Fabian Frederick <fabf@skynet.be> Suggested-by: Jan Kara <jack@suse.cz> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Roger Pau Monne <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-12	netfilter: nfnetlink_cthelper: Remove 'const' and '&' to avoid warnings	Chen Gang
	[ Upstream commit b18c5d15e8714336365d9d51782d5b53afa0443c ] The related code can be simplified, and also can avoid related warnings (with allmodconfig under parisc): CC [M] net/netfilter/nfnetlink_cthelper.o net/netfilter/nfnetlink_cthelper.c: In function ‘nfnl_cthelper_from_nlattr’: net/netfilter/nfnetlink_cthelper.c:97:9: warning: passing argument 1 o ‘memcpy’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-array-qualifiers] memcpy(&help->data, nla_data(attr), help->helper->data_len); ^ In file included from include/linux/string.h:17:0, from include/uapi/linux/uuid.h:25, from include/linux/uuid.h:23, from include/linux/mod_devicetable.h:12, from ./arch/parisc/include/asm/hardware.h:4, from ./arch/parisc/include/asm/processor.h:15, from ./arch/parisc/include/asm/spinlock.h:6, from ./arch/parisc/include/asm/atomic.h:21, from include/linux/atomic.h:4, from ./arch/parisc/include/asm/bitops.h:12, from include/linux/bitops.h:36, from include/linux/kernel.h:10, from include/linux/list.h:8, from include/linux/module.h:9, from net/netfilter/nfnetlink_cthelper.c:11: ./arch/parisc/include/asm/string.h:8:8: note: expected ‘void ’ but argument is of type ‘const char ()[]’ void * memcpy(void * dest,const void *src,size_t count); ^ Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@soleta.eu> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-12	config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected	Konrad Rzeszutek Wilk
	[ Upstream commit a6dfa128ce5c414ab46b1d690f7a1b8decb8526d ] A huge amount of NIC drivers use the DMA API, however if compiled under 32-bit an very important part of the DMA API can be ommitted leading to the drivers not working at all (especially if used with 'swiotlb=force iommu=soft'). As Prashant Sreedharan explains it: "the driver [tg3] uses DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma "mapping" and dma_unmap_addr() to get the "mapping" value. On most of the platforms this is a no-op, but ... with "iommu=soft and swiotlb=force" this house keeping is required, ... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the DMA address." As such enable this even when using 32-bit kernels. Reported-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Prashant Sreedharan <prashant@broadcom.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Chan <mchan@broadcom.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: cascardo@linux.vnet.ibm.com Cc: david.vrabel@citrix.com Cc: sanjeevb@broadcom.com Cc: siva.kallam@broadcom.com Cc: vyasevich@gmail.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/20150417190448.GA9462@l.oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-12	kprobes/x86: Return correct length in __copy_instruction()	Eugene Shatokhin
	[ Upstream commit c80e5c0c23ce2282476fdc64c4b5e3d3a40723fd ] On x86-64, __copy_instruction() always returns 0 (error) if the instruction uses %rip-relative addressing. This is because kernel_insn_init() is called the second time for 'insn' instance in such cases and sets all its fields to 0. Because of this, trying to place a kprobe on such instruction will fail, register_kprobe() will return -EINVAL. This patch fixes the problem. Signed-off-by: Eugene Shatokhin <eugene.shatokhin@rosalab.ru> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20150317100918.28349.94654.stgit@localhost.localdomain Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-12	mnt: Modify fs_fully_visible to deal with locked ro nodev and atime	Eric W. Biederman
	[ Upstream commit 8c6cf9cc829fcd0b179b59f7fe288941d0e31108 ] Ignore an existing mount if the locked readonly, nodev or atime attributes are less permissive than the desired attributes of the new mount. On success ensure the new mount locks all of the same readonly, nodev and atime attributes as the old mount. The nosuid and noexec attributes are not checked here as this change is destined for stable and enforcing those attributes causes a regression in lxc and libvirt-lxc where those applications will not start and there are no known executables on sysfs or proc and no known way to create exectuables without code modifications Cc: stable@vger.kernel.org Fixes: e51db73532955 ("userns: Better restrictions on when proc and sysfs can be mounted") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-12	tty/serial: at91: RS485 mode: 0 is valid for delay_rts_after_send	Nicolas Ferre
	[ Upstream commit 8687634b7908c42eb700e0469e110e02833611d1 ] In RS485 mode, we may want to set the delay_rts_after_send value to 0. In the datasheet, the 0 value is said to "disable" the Transmitter Timeguard but this is exactly the expected behavior if we want no delay... Moreover, if the value was set to non-zero value by device-tree or earlier ioctl command, it was impossible to change it back to zero. Reported-by: Sami Pietikäinen <Sami.Pietikainen@wapice.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: stable@vger.kernel.org # 3.2+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-12	mnt: Refactor the logic for mounting sysfs and proc in a user namespace	Eric W. Biederman
	[ Upstream commit 1b852bceb0d111e510d1a15826ecc4a19358d512 ] Fresh mounts of proc and sysfs are a very special case that works very much like a bind mount. Unfortunately the current structure can not preserve the MNT_LOCK... mount flags. Therefore refactor the logic into a form that can be modified to preserve those lock bits. Add a new filesystem flag FS_USERNS_VISIBLE that requires some mount of the filesystem be fully visible in the current mount namespace, before the filesystem may be mounted. Move the logic for calling fs_fully_visible from proc and sysfs into fs/namespace.c where it has greater access to mount namespace state. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-09	Linux 3.18.18v3.18.18	Sasha Levin
	Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-05	mmc: sdhci-pxav3: do the mbus window configuration after enabling clocks	Thomas Petazzoni
	[ upstream commit aa8165f914420f143476305a01894b017d3abe6b ] In commit 5491ce3f79ee ("mmc: sdhci-pxav3: add support for the Armada 38x SDHCI controller"), the sdhci-pxav3 driver was extended to include support for the SDHCI controller found in the Armada 38x processor. This mainly involved adding some MBus window related configuration. However, this configuration is currently done too early in ->probe(): it is done before clocks are enabled, while this configuration involves touching the registers of the controller, which will hang the SoC if the clock is disabled. It wasn't noticed until now because the bootloader typically leaves gatable clocks enabled, but in situations where we have a deferred probe (due to a CD GPIO that cannot be taken, for example), then the probe will be re-tried later, after a clock disable has been done in the exit path of the failed probe attempt of the device. This second probe() will hang the system due to the clock being disabled. This can for example be produced on Armada 385 GP, which has a CD GPIO connected to an I2C PCA9555. If the driver for the PCA9555 is not compiled into the kernel, then we will have the following sequence of events: 1. The SDHCI probes 2. It does the MBus configuration (which works, because the clock is left enabled by the bootloader) 3. It enables the clock 4. It tries to get the CD GPIO, which fails due to the driver being missing, so -EPROBE_DEFER is returned. 5. Before returning -EPROBE_DEFER, the driver cleans up what was done, which includes disabling the clock. 6. Later on, the SDHCI probe is tried again. 7. It does the MBus configuration, which hangs because the clock is no longer enabled. This commit does the obvious fix of doing the MBus configuration after the clock has been enabled by the driver. Fixes: 5491ce3f79ee ("mmc: sdhci-pxav3: add support for the Armada 38x SDHCI controller") Cc: <stable@vger.kernel.org> # v3.15+ Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> [jogo: rebased onto 3.18.17] Signed-off-by: Jonas Gorski <jogo@openwrt.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-05	sctp: Fix race between OOTB responce and route removal	Alexander Sverdlin
	[ Upstream commit 29c4afc4e98f4dc0ea9df22c631841f9c220b944 ] There is NULL pointer dereference possible during statistics update if the route used for OOTB responce is removed at unfortunate time. If the route exists when we receive OOTB packet and we finally jump into sctp_packet_transmit() to send ABORT, but in the meantime route is removed under our feet, we take "no_route" path and try to update stats with IP_INC_STATS(sock_net(asoc->base.sk), ...). But sctp_ootb_pkt_new() used to prepare responce packet doesn't call sctp_transport_set_owner() and therefore there is no asoc associated with this packet. Probably temporary asoc just for OOTB responces is overkill, so just introduce a check like in all other places in sctp_packet_transmit(), where "asoc" is dereferenced. To reproduce this, one needs to 0. ensure that sctp module is loaded (otherwise ABORT is not generated) 1. remove default route on the machine 2. while true; do ip route del [interface-specific route] ip route add [interface-specific route] done 3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT responce On x86_64 the crash looks like this: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp] PGD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: ... CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.0.5-1-ARCH #1 Hardware name: ... task: ffffffff818124c0 ti: ffffffff81800000 task.ti: ffffffff81800000 RIP: 0010:[<ffffffffa05ec9ac>] [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp] RSP: 0018:ffff880127c037b8 EFLAGS: 00010296 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000015ff66b480 RDX: 00000015ff66b400 RSI: ffff880127c17200 RDI: ffff880123403700 RBP: ffff880127c03888 R08: 0000000000017200 R09: ffffffff814625af R10: ffffea00047e4680 R11: 00000000ffffff80 R12: ffff8800b0d38a28 R13: ffff8800b0d38a28 R14: ffff8800b3e88000 R15: ffffffffa05f24e0 FS: 0000000000000000(0000) GS:ffff880127c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000020 CR3: 00000000c855b000 CR4: 00000000000007f0 Stack: ffff880127c03910 ffff8800b0d38a28 ffffffff8189d240 ffff88011f91b400 ffff880127c03828 ffffffffa05c94c5 0000000000000000 ffff8800baa1c520 0000000000000000 0000000000000001 0000000000000000 0000000000000000 Call Trace: <IRQ> [<ffffffffa05c94c5>] ? sctp_sf_tabort_8_4_8.isra.20+0x85/0x140 [sctp] [<ffffffffa05d6b42>] ? sctp_transport_put+0x52/0x80 [sctp] [<ffffffffa05d0bfc>] sctp_do_sm+0xb8c/0x19a0 [sctp] [<ffffffff810b0e00>] ? trigger_load_balance+0x90/0x210 [<ffffffff810e0329>] ? update_process_times+0x59/0x60 [<ffffffff812c7a40>] ? timerqueue_add+0x60/0xb0 [<ffffffff810e0549>] ? enqueue_hrtimer+0x29/0xa0 [<ffffffff8101f599>] ? read_tsc+0x9/0x10 [<ffffffff8116d4b5>] ? put_page+0x55/0x60 [<ffffffff810ee1ad>] ? clockevents_program_event+0x6d/0x100 [<ffffffff81462b68>] ? skb_free_head+0x58/0x80 [<ffffffffa029a10b>] ? chksum_update+0x1b/0x27 [crc32c_generic] [<ffffffff81283f3e>] ? crypto_shash_update+0xce/0xf0 [<ffffffffa05d3993>] sctp_endpoint_bh_rcv+0x113/0x280 [sctp] [<ffffffffa05dd4e6>] sctp_inq_push+0x46/0x60 [sctp] [<ffffffffa05ed7a0>] sctp_rcv+0x880/0x910 [sctp] [<ffffffffa05ecb50>] ? sctp_packet_transmit_chunk+0xb0/0xb0 [sctp] [<ffffffffa05ecb70>] ? sctp_csum_update+0x20/0x20 [sctp] [<ffffffff814b05a5>] ? ip_route_input_noref+0x235/0xd30 [<ffffffff81051d6b>] ? ack_ioapic_level+0x7b/0x150 [<ffffffff814b27be>] ip_local_deliver_finish+0xae/0x210 [<ffffffff814b2e15>] ip_local_deliver+0x35/0x90 [<ffffffff814b2a15>] ip_rcv_finish+0xf5/0x370 [<ffffffff814b3128>] ip_rcv+0x2b8/0x3a0 [<ffffffff81474193>] __netif_receive_skb_core+0x763/0xa50 [<ffffffff81476c28>] __netif_receive_skb+0x18/0x60 [<ffffffff81476cb0>] netif_receive_skb_internal+0x40/0xd0 [<ffffffff814776c8>] napi_gro_receive+0xe8/0x120 [<ffffffffa03946aa>] rtl8169_poll+0x2da/0x660 [r8169] [<ffffffff8147896a>] net_rx_action+0x21a/0x360 [<ffffffff81078dc1>] __do_softirq+0xe1/0x2d0 [<ffffffff8107912d>] irq_exit+0xad/0xb0 [<ffffffff8157d158>] do_IRQ+0x58/0xf0 [<ffffffff8157b06d>] common_interrupt+0x6d/0x6d <EOI> [<ffffffff810e1218>] ? hrtimer_start+0x18/0x20 [<ffffffffa05d65f9>] ? sctp_transport_destroy_rcu+0x29/0x30 [sctp] [<ffffffff81020c50>] ? mwait_idle+0x60/0xa0 [<ffffffff810216ef>] arch_cpu_idle+0xf/0x20 [<ffffffff810b731c>] cpu_startup_entry+0x3ec/0x480 [<ffffffff8156b365>] rest_init+0x85/0x90 [<ffffffff818eb035>] start_kernel+0x48b/0x4ac [<ffffffff818ea120>] ? early_idt_handlers+0x120/0x120 [<ffffffff818ea339>] x86_64_start_reservations+0x2a/0x2c [<ffffffff818ea49c>] x86_64_start_kernel+0x161/0x184 Code: 90 48 8b 80 b8 00 00 00 48 89 85 70 ff ff ff 48 83 bd 70 ff ff ff 00 0f 85 cd fa ff ff 48 89 df 31 db e8 18 63 e7 e0 48 8b 45 80 <48> 8b 40 20 48 8b 40 30 48 8b 80 68 01 00 00 65 48 ff 40 78 e9 RIP [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp] RSP <ffff880127c037b8> CR2: 0000000000000020 ---[ end trace 5aec7fd2dc983574 ]--- Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff) drm_kms_helper: panic occurred, switching back to text console ---[ end Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-05	bnx2x: fix lockdep splat	Eric Dumazet
	[ Upstream commit d53c66a5b80698620f7c9ba2372fff4017e987b8 ] Michel reported following lockdep splat [ 44.718117] INFO: trying to register non-static key. [ 44.723081] the code is fine but needs lockdep annotation. [ 44.728559] turning off the locking correctness validator. [ 44.734036] CPU: 8 PID: 5483 Comm: ethtool Not tainted 4.1.0 [ 44.770289] Call Trace: [ 44.772741] [<ffffffff816eb1cd>] dump_stack+0x4c/0x65 [ 44.777879] [<ffffffff8111d921>] ? console_unlock+0x1f1/0x510 [ 44.783708] [<ffffffff811121f5>] __lock_acquire+0x1d05/0x1f10 [ 44.789538] [<ffffffff8111370a>] ? mark_held_locks+0x6a/0x90 [ 44.795276] [<ffffffff81113835>] ? trace_hardirqs_on_caller+0x105/0x1d0 [ 44.801967] [<ffffffff8111390d>] ? trace_hardirqs_on+0xd/0x10 [ 44.807793] [<ffffffff811330fa>] ? hrtimer_try_to_cancel+0x4a/0x250 [ 44.814142] [<ffffffff81112ba6>] lock_acquire+0xb6/0x290 [ 44.819537] [<ffffffff810d6675>] ? flush_work+0x5/0x280 [ 44.824844] [<ffffffff810d66ad>] flush_work+0x3d/0x280 [ 44.830061] [<ffffffff810d6675>] ? flush_work+0x5/0x280 [ 44.835366] [<ffffffff816f3c43>] ? schedule_hrtimeout_range+0x13/0x20 [ 44.841889] [<ffffffff8112ec9b>] ? usleep_range+0x4b/0x50 [ 44.847365] [<ffffffff8111370a>] ? mark_held_locks+0x6a/0x90 [ 44.853102] [<ffffffff810d8585>] ? __cancel_work_timer+0x105/0x1c0 [ 44.859359] [<ffffffff81113835>] ? trace_hardirqs_on_caller+0x105/0x1d0 [ 44.866045] [<ffffffff810d851f>] __cancel_work_timer+0x9f/0x1c0 [ 44.872048] [<ffffffffa0010982>] ? bnx2x_func_stop+0x42/0x90 [bnx2x] [ 44.878481] [<ffffffff810d8670>] cancel_work_sync+0x10/0x20 [ 44.884134] [<ffffffffa00259e5>] bnx2x_chip_cleanup+0x245/0x730 [bnx2x] [ 44.890829] [<ffffffff8110ce02>] ? up+0x32/0x50 [ 44.895439] [<ffffffff811306b5>] ? del_timer_sync+0x5/0xd0 [ 44.901005] [<ffffffffa005596d>] bnx2x_nic_unload+0x20d/0x8e0 [bnx2x] [ 44.907527] [<ffffffff811f1aef>] ? might_fault+0x5f/0xb0 [ 44.912921] [<ffffffffa005851c>] bnx2x_reload_if_running+0x2c/0x50 [bnx2x] [ 44.919879] [<ffffffffa005a3c5>] bnx2x_set_ringparam+0x2b5/0x460 [bnx2x] [ 44.926664] [<ffffffff815d498b>] dev_ethtool+0x55b/0x1c40 [ 44.932148] [<ffffffff815dfdc7>] ? rtnl_lock+0x17/0x20 [ 44.937364] [<ffffffff815e7f8b>] dev_ioctl+0x17b/0x630 [ 44.942582] [<ffffffff815abf8d>] sock_do_ioctl+0x5d/0x70 [ 44.947972] [<ffffffff815ac013>] sock_ioctl+0x73/0x280 [ 44.953192] [<ffffffff8124c1c8>] do_vfs_ioctl+0x88/0x5b0 [ 44.958587] [<ffffffff8110d0b3>] ? up_read+0x23/0x40 [ 44.963631] [<ffffffff812584cc>] ? __fget_light+0x6c/0xa0 [ 44.969105] [<ffffffff8124c781>] SyS_ioctl+0x91/0xb0 [ 44.974149] [<ffffffff816f4dd7>] system_call_fastpath+0x12/0x6f As bnx2x_init_ptp() is only called if bp->flags contains PTP_SUPPORTED, we also need to guard bnx2x_stop_ptp() with same condition, otherwise ptp_task workqueue is not initialized and kernel barfs on cancel_work_sync() Fixes: eeed018cbfa30 ("bnx2x: Add timestamping and PTP hardware clock support") Reported-by: Michel Lespinasse <walken@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Michal Kalderon <Michal.Kalderon@qlogic.com> Cc: Ariel Elior <Ariel.Elior@qlogic.com> Cc: Yuval Mintz <Yuval.Mintz@qlogic.com> Cc: David Decotigny <decot@google.com> Acked-by: Sony Chacko <sony.chacko@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-05	net: phy: fix phy link up when limiting speed via device tree	Mugunthan V N
	[ Upstream commit eb686231fce3770299760f24fdcf5ad041f44153 ] When limiting phy link speed using "max-speed" to 100mbps or less on a giga bit phy, phy never completes auto negotiation and phy state machine is held in PHY_AN. Fixing this issue by comparing the giga bit advertise though phydev->supported doesn't have it but phy has BMSR_ESTATEN set. So that auto negotiation is restarted as old and new advertise are different and link comes up fine. Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-05	net/mlx4_en: Wake TX queues only when there's enough room	Ido Shamay
	[ Upstream commit 488a9b48e398b157703766e2cd91ea45ac6997c5 ] Indication of a single completed packet, marked by txbbs_skipped being bigger then zero, in not enough in order to wake up a stopped TX queue. The completed packet may contain a single TXBB, while next packet to be sent (after the wake up) may have multiple TXBBs (LSO/TSO packets for example), causing overflow in queue followed by WQE corruption and TX queue timeout. Instead, wake the stopped queue only when there's enough room for the worst case (maximum sized WQE) packet that we should need to handle after the queue is opened again. Also created an helper routine - mlx4_en_is_tx_ring_full, which checks if the current TX ring is full or not. It provides better code readability and removes code duplication. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-05	tcp: Do not call tcp_fastopen_reset_cipher from interrupt context	Christoph Paasch
	[ Upstream commit dfea2aa654243f70dc53b8648d0bbdeec55a7df1 ] tcp_fastopen_reset_cipher really cannot be called from interrupt context. It allocates the tcp_fastopen_context with GFP_KERNEL and calls crypto_alloc_cipher, which allocates all kind of stuff with GFP_KERNEL. Thus, we might sleep when the key-generation is triggered by an incoming TFO cookie-request which would then happen in interrupt- context, as shown by enabling CONFIG_DEBUG_ATOMIC_SLEEP: [ 36.001813] BUG: sleeping function called from invalid context at mm/slub.c:1266 [ 36.003624] in_atomic(): 1, irqs_disabled(): 0, pid: 1016, name: packetdrill [ 36.004859] CPU: 1 PID: 1016 Comm: packetdrill Not tainted 4.1.0-rc7 #14 [ 36.006085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 36.008250] 00000000000004f2 ffff88007f8838a8 ffffffff8171d53a ffff880075a084a8 [ 36.009630] ffff880075a08000 ffff88007f8838c8 ffffffff810967d3 ffff88007f883928 [ 36.011076] 0000000000000000 ffff88007f8838f8 ffffffff81096892 ffff88007f89be00 [ 36.012494] Call Trace: [ 36.012953] <IRQ> [<ffffffff8171d53a>] dump_stack+0x4f/0x6d [ 36.014085] [<ffffffff810967d3>] ___might_sleep+0x103/0x170 [ 36.015117] [<ffffffff81096892>] __might_sleep+0x52/0x90 [ 36.016117] [<ffffffff8118e887>] kmem_cache_alloc_trace+0x47/0x190 [ 36.017266] [<ffffffff81680d82>] ? tcp_fastopen_reset_cipher+0x42/0x130 [ 36.018485] [<ffffffff81680d82>] tcp_fastopen_reset_cipher+0x42/0x130 [ 36.019679] [<ffffffff81680f01>] tcp_fastopen_init_key_once+0x61/0x70 [ 36.020884] [<ffffffff81680f2c>] __tcp_fastopen_cookie_gen+0x1c/0x60 [ 36.022058] [<ffffffff816814ff>] tcp_try_fastopen+0x58f/0x730 [ 36.023118] [<ffffffff81671788>] tcp_conn_request+0x3e8/0x7b0 [ 36.024185] [<ffffffff810e3872>] ? __module_text_address+0x12/0x60 [ 36.025327] [<ffffffff8167b2e1>] tcp_v4_conn_request+0x51/0x60 [ 36.026410] [<ffffffff816727e0>] tcp_rcv_state_process+0x190/0xda0 [ 36.027556] [<ffffffff81661f97>] ? __inet_lookup_established+0x47/0x170 [ 36.028784] [<ffffffff8167c2ad>] tcp_v4_do_rcv+0x16d/0x3d0 [ 36.029832] [<ffffffff812e6806>] ? security_sock_rcv_skb+0x16/0x20 [ 36.030936] [<ffffffff8167cc8a>] tcp_v4_rcv+0x77a/0x7b0 [ 36.031875] [<ffffffff816af8c3>] ? iptable_filter_hook+0x33/0x70 [ 36.032953] [<ffffffff81657d22>] ip_local_deliver_finish+0x92/0x1f0 [ 36.034065] [<ffffffff81657f1a>] ip_local_deliver+0x9a/0xb0 [ 36.035069] [<ffffffff81657c90>] ? ip_rcv+0x3d0/0x3d0 [ 36.035963] [<ffffffff81657569>] ip_rcv_finish+0x119/0x330 [ 36.036950] [<ffffffff81657ba7>] ip_rcv+0x2e7/0x3d0 [ 36.037847] [<ffffffff81610652>] __netif_receive_skb_core+0x552/0x930 [ 36.038994] [<ffffffff81610a57>] __netif_receive_skb+0x27/0x70 [ 36.040033] [<ffffffff81610b72>] process_backlog+0xd2/0x1f0 [ 36.041025] [<ffffffff81611482>] net_rx_action+0x122/0x310 [ 36.042007] [<ffffffff81076743>] __do_softirq+0x103/0x2f0 [ 36.042978] [<ffffffff81723e3c>] do_softirq_own_stack+0x1c/0x30 This patch moves the call to tcp_fastopen_init_key_once to the places where a listener socket creates its TFO-state, which always happens in user-context (either from the setsockopt, or implicitly during the listen()-call) Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Fixes: 222e83d2e0ae ("tcp: switch tcp_fastopen key generation to net_get_random_once") Signed-off-by: Christoph Paasch <cpaasch@apple.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-05	neigh: do not modify unlinked entries	Julian Anastasov
	[ Upstream commit 2c51a97f76d20ebf1f50fef908b986cb051fdff9 ] The lockless lookups can return entry that is unlinked. Sometimes they get reference before last neigh_cleanup_and_release, sometimes they do not need reference. Later, any modification attempts may result in the following problems: 1. entry is not destroyed immediately because neigh_update can start the timer for dead entry, eg. on change to NUD_REACHABLE state. As result, entry lives for some time but is invisible and out of control. 2. __neigh_event_send can run in parallel with neigh_destroy while refcnt=0 but if timer is started and expired refcnt can reach 0 for second time leading to second neigh_destroy and possible crash. Thanks to Eric Dumazet and Ying Xue for their work and analyze on the __neigh_event_send change. Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour") Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.") Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().") Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-07-05	packet: avoid out of bounds read in round robin fanout	Willem de Bruijn
	[ Upstream commit 468479e6043c84f5a65299cc07cb08a22a28c2b1 ] PACKET_FANOUT_LB computes f->rr_cur such that it is modulo f->num_members. It returns the old value unconditionally, but f->num_members may have changed since the last store. Ensure that the return value is always < num. When modifying the logic, simplify it further by replacing the loop with an unconditional atomic increment. Fixes: dc99f600698d ("packet: Add fanout support.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>