lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message	Author
2015-04-11	coda: switch to ->read_iter/->write_iter	Al Viro
	... and request the same from the local cache - all filesystems with anything usable for that support those already. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	ncpfs: switch to ->read_iter/->write_iter	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	nommu: use __vfs_read()	Al Viro
	... instead of open-coding the call of ->read() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	acct: check FMODE_CAN_WRITE	Al Viro
	it's not calling ->write() directly anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	aio_run_iocb(): kill dead check	Al Viro
	We check if ->ki_pos is positive. However, by that point we have already done rw_verify_area(), which would have rejected such unless the file had been one of /dev/mem, /dev/kmem and /proc/kcore. All of which do not have vectored rw methods, so we would've bailed out even earlier. This check had been introduced before rw_verify_area() had been added there - in fact, it was a subset of checks done on sync paths by rw_verify_area() (back then the /dev/mem exception didn't exist at all). The rest of checks (mandatory locking, etc.) hadn't been added until later. Unfortunately, by the time the call of rw_verify_area() got added, the /dev/mem exception had already appeared, so it wasn't obvious that the older explicit check downstream had become dead code. It is dead code, though, since the few files for which the exception applies do not have ->aio_{read,write}() or ->{read,write}_iter() and for them we won't reach that check anyway. What's more, even if we ever introduce vectored methods for /dev/mem and friends, they'll have to cope with negative positions anyway, since readv(2) and writev(2) are using the same checks as read(2) and write(2) - i.e. rw_verify_area(). Let's bury it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	ioctx_alloc(): remove pointless check	Al Viro
	Way, way back kiocb used to be picked from arrays, so ioctx_alloc() checked for multiplication overflow when calculating the size of such array. By the time fs/aio.c went into the tree (in 2002) they were already allocated one-by-one by kmem_cache_alloc(), so that check had already become pointless. Let's bury it... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	lustre: kill unused members of struct vvp_thread_info	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	expand __fuse_direct_write() in both callers	Al Viro
	it's actually shorter that way and later we'll want iocb in scope of generic_write_check() caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	fuse: switch fuse_direct_io_file_operations to ->{read,write}_iter()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	cuse: switch to iov_iter	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	Merge branch 'for-davem' into for-next	Al Viro

2015-04-11	sg_start_req(): use import_iovec()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	sg_start_req(): make sure that there's not too many elements in iovec	Al Viro
	unfortunately, allowing an arbitrary 16bit value means a possibility of overflow in the calculation of total number of pages in bio_map_user_iov() - we rely on there being no more than PAGE_SIZE members of sum in the first loop there. If that sum wraps around, we end up allocating too small an array of pointers to pages and it's easy to overflow it in the second loop. X-Coverup: TINC (and there's no lumber cartel either) Cc: stable@vger.kernel.org # way, way back Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	blk_rq_map_user(): use import_single_range()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	sg_io(): use import_iovec()	Al Viro
	... and don't skip access_ok() validation. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	process_vm_access: switch to {compat_,}import_iovec()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	switch keyctl_instantiate_key_common() to iov_iter	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	switch {compat_,}do_readv_writev() to {compat_,}import_iovec()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	aio_setup_vectored_rw(): switch to {compat_,}import_iovec()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	vmsplice_to_user(): switch to import_iovec()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	kill aio_setup_single_vector()	Al Viro
	identical to import_single_range() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	Merge branch 'iov_iter' into for-next	Al Viro

2015-04-11	aio: simplify arguments of aio_setup_..._rw()	Al Viro
	We don't need req in either of those. We don't need nr_segs in caller. We don't really need len in caller either - iov_iter_count(&iter) will do. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	aio: lift iov_iter_init() into aio_setup_..._rw()	Al Viro
	the only non-trivial detail is that we do it before rw_verify_area(), so we'd better cap the length ourselves in aio_setup_single_rw() case (for vectored case rw_copy_check_uvector() will do that for us). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	lift iov_iter into {compat_,}do_readv_writev()	Al Viro
	get it closer to matching {compat_,}rw_copy_check_uvector(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	Merge branch 'iocb' into for-next	Al Viro

2015-04-11	NFS: fix BUG() crash in notify_change() with patch to chown_common()	Andrew Elble
	We have observed a BUG() crash in fs/attr.c:notify_change(). The crash occurs during an rsync into a filesystem that is exported via NFS.
	1.) fs/attr.c:notify_change() modifies the caller's version of attr.
	2.) 6de0ec00ba8d ("VFS: make notify_change pass ATTR_KILL_S*ID to setattr operations") introduced a BUG() restriction such that "no function will ever call notify_change() with both ATTR_MODE and ATTR_KILL_S*ID set". Under some circumstances though, it will have assisted in setting the caller's version of attr to this very combination.
	3.) 27ac0ffeac80 ("locks: break delegations on any attribute modification") introduced code to handle breaking delegations. This can result in notify_change() being re-called. attr _must_ be explicitly reset to avoid triggering the BUG() established in #2.
	4.) The path that triggers this is via fs/open.c:chown_common(). The combination of attr flags set here and in the first call to notify_change() along with a later failed break_deleg_wait() results in notify_change() being called again via retry_deleg without resetting attr.
	Solution is to move retry_deleg in chown_common() a bit further up to ensure attr is completely reset. There are other places where this seemingly could occur, such as fs/utimes.c:utimes_common(), but the attr flags are not initially set in such a way to trigger this.
	Fixes: 27ac0ffeac80 ("locks: break delegations on any attribute modification") Reported-by: Eric Meddaugh <etmsys@rit.edu> Tested-by: Eric Meddaugh <etmsys@rit.edu> Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	dcache: return -ESTALE not -EBUSY on distributed fs race	J. Bruce Fields
	On a distributed filesystem it's possible for lookup to discover that a directory it just found is already cached elsewhere in the directory hierarchy. The dcache won't let us keep the directory in both places, so we have to move the dentry to the new location from the place we previously had it cached. If the parent has changed, then this requires all the same locks as we'd need to do a cross-directory rename. But we're already in lookup holding one parent's i_mutex, so it's too late to acquire those locks in the right order. The (unreliable) solution in __d_unalias is to trylock() the required locks and return -EBUSY if it fails. I see no particular reason for returning -EBUSY, and -ESTALE is already the result of some other lookup races on NFS. I think -ESTALE is the more helpful error return. It also allows us to take advantage of the logic Jeff Layton added in c6a9428401c0 "vfs: fix renameat to retry on ESTALE errors" and ancestors, which hopefully resolves some of these errors before they're returned to userspace. I can reproduce these cases using NFS with:
		ssh root@$client '
			mount -olookupcache=pos '$server':'$export' /mnt/
			mkdir /mnt/TO
			mkdir /mnt/DIR
			touch /mnt/DIR/test.txt
			while true; do
				strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY
			done
		'
		ssh root@$server '
			while true; do
				mv $export/DIR $export/TO/DIR
				mv $export/TO/DIR $export/DIR
			done
		'
	It also helps to add some other concurrent use of the directory on the client (e.g., "ls /mnt/TO"). And you can replace the server-side mv's by client-side mv's that are repeatedly killed. (If the client is interrupted while waiting for the RENAME response then it's left with a dentry that has to go under one parent or the other, but it doesn't yet know which.) Acked-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	NTFS: Version 2.1.32 - Update file write from aio_write to write_iter.	Anton Altaparmakov
	Signed-off-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	VFS: Add iov_iter_fault_in_multipages_readable()	Anton Altaparmakov
	similar to iov_iter_fault_in_readable() but differs in that it is not limited to faulting in the first iovec and instead faults in "bytes" bytes iterating over the iovecs as necessary. Also, instead of only faulting in the first and last page of the range, all pages are faulted in. This function is needed by NTFS when it does multi page file writes. Signed-off-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
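The iteration described in this commit message can be illustrated with a small userspace analogue. This is only a sketch under stated assumptions, not the kernel implementation: `touch_all_pages()` and `SKETCH_PAGE_SIZE` are hypothetical names, the page size is fixed at 4096 for illustration, and the function returns a count of touched pages where the kernel helper would report a fault error instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Assumed page size for this sketch; real code would query the system. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical userspace analogue: read one byte from every page covered
 * by the first `bytes` bytes of the iovec array, walking across segments
 * as needed. Returns the number of page touches performed.
 */
size_t touch_all_pages(const struct iovec *iov, size_t nr_segs, size_t bytes)
{
    size_t touched = 0;

    for (size_t i = 0; i < nr_segs && bytes > 0; i++) {
        /* Never touch more of a segment than the remaining byte budget. */
        size_t len = iov[i].iov_len < bytes ? iov[i].iov_len : bytes;
        uintptr_t addr = (uintptr_t)iov[i].iov_base;
        uintptr_t end = addr + len;

        while (addr < end) {
            /* Touch the page containing addr. */
            (void)*(const volatile unsigned char *)addr;
            touched++;
            /* Advance to the first byte of the next page. */
            addr = (addr | (SKETCH_PAGE_SIZE - 1)) + 1;
        }
        bytes -= len;
    }
    return touched;
}
```

Every page in the range is touched, not just the first and last, and the walk continues past the first segment until the byte budget runs out.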
2015-04-11	drop bogus check in file_open_root()	Al Viro
	For one thing, LOOKUP_DIRECTORY will be dealt with in do_last(). For another, name can be an empty string, but not NULL - no callers pass that and it would oops immediately if they did. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	switch security_inode_getattr() to struct path *	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	constify tomoyo_realpath_from_path()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	whack-a-mole: there's no point doing set_fs(USER_DS) in sigframe setup	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	whack-a-mole: no need to set_fs(USER_DS) in {start,flush}_thread()	Al Viro
	flush_old_exec() has already done that. Back in 2011 a bunch of instances like that had been kicked out, but that hadn't taken care of then-out-of-tree architectures, obviously, and they served as reinfection vector... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	remove incorrect comment in lookup_one_len()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	namei.c: fold do_path_lookup() into both callers	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	kill struct filename.separate	Al Viro
	just make const char iname[] the last member and compare name->name with name->iname instead of checking name->separate. We need to make sure that out-of-line name doesn't end up allocated adjacent to struct filename referring to it; fortunately, it's easy to achieve - just allocate that struct filename with one byte in ->iname[], so that ->iname[0] will be inside the same object and thus have an address different from that of out-of-line name [spotted by Boqun Feng <boqun.feng@gmail.com>] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
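The pointer-comparison trick described in this commit message can be sketched in userspace C. This is an illustration under assumptions, not the kernel's struct filename: `struct name_buf`, `INLINE_MAX` and the helper functions are hypothetical names, and the inline capacity is arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_MAX 16 /* hypothetical inline capacity, including the NUL */

struct name_buf {
    const char *name; /* points at iname[] or at a separate allocation */
    char iname[];     /* inline storage; must be the last member */
};

/* Copy s inline when it fits, otherwise store it out of line. */
struct name_buf *name_buf_new(const char *s)
{
    size_t len = strlen(s) + 1;
    struct name_buf *nb;

    if (len <= INLINE_MAX) {
        nb = malloc(offsetof(struct name_buf, iname) + INLINE_MAX);
        if (!nb)
            return NULL;
        memcpy(nb->iname, s, len);
        nb->name = nb->iname;
    } else {
        char *out;

        /*
         * Reserve one byte of iname[] so that &iname[0] lies inside this
         * object and can never equal the address of the out-of-line name.
         */
        nb = malloc(offsetof(struct name_buf, iname) + 1);
        if (!nb)
            return NULL;
        out = malloc(len);
        if (!out) {
            free(nb);
            return NULL;
        }
        memcpy(out, s, len);
        nb->name = out;
    }
    return nb;
}

/* No separate flag is needed: compare the pointer with the inline array. */
bool name_buf_is_inline(const struct name_buf *nb)
{
    return nb->name == nb->iname;
}

void name_buf_free(struct name_buf *nb)
{
    if (!nb)
        return;
    if (!name_buf_is_inline(nb))
        free((void *)nb->name);
    free(nb);
}
```

Without the reserved byte, an allocator could place the out-of-line name immediately after the header, making `nb->name == nb->iname` true for a name that is not inline.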
2015-04-11	new helper: msg_data_left()	Al Viro
	convert open-coded instances Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11	Merge remote-tracking branch 'dh/afs' into for-davem	Al Viro

2015-04-11	get rid of the size argument of sock_sendmsg()	Al Viro
	it's equal to iov_iter_count(&msg->msg_iter) in all cases Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09	switch kernel_sendmsg() and kernel_recvmsg() to iov_iter_kvec()	Al Viro
	For kernel_sendmsg() that eliminates the need to play with set_fs(); for kernel_recvmsg() it does not - a couple of callers are using it with non-NULL ->msg_control, which would be treated as userland address on recvmsg side of things. In all cases we are really setting a kvec-backed iov_iter, though. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09	net: switch importing msghdr from userland to {compat_,}import_iovec()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09	net: switch sendto() and recvfrom() to import_single_range()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09	Merge branch 'iov_iter' into for-davem	Al Viro

2015-04-09	Merge branch 'iocb' into for-davem	Al Viro
	trivial conflict in net/socket.c and non-trivial one in crypto - that one had evaded aio_complete() removal. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-07	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	David S. Miller
	Johan Hedberg says:
	====================
	pull request: bluetooth-next 2015-04-04
	Here's what's probably the last bluetooth-next pull request for 4.1:
	 - Fixes for LE advertising data & advertising parameters
	 - Fix for race condition with HCI_RESET flag
	 - New BNEPGETSUPPFEAT ioctl, needed for certification
	 - New HCI request callback type to get the resulting skb
	 - Cleanups to use BIT() macro wherever possible
	 - Consolidate Broadcom device entries in the btusb HCI driver
	 - Check for valid flags in CMTP, HIDP & BNEP
	 - Disallow local privacy & OOB data combo to prevent a potential race
	 - Expose SMP & ECDH selftest results through debugfs
	 - Expose current Device ID info through debugfs
	Please let me know if there are any issues pulling. Thanks.
	====================
	Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller
	Conflicts: drivers/net/ethernet/mellanox/mlx4/cmd.c net/core/fib_rules.c net/ipv4/fib_frontend.c The fib_rules.c and fib_frontend.c conflicts were locking adjustments in 'net' overlapping addition and removal of code in 'net-next'. The mlx4 conflict was a bug fix in 'net' happening in the same place a constant was being replaced with a more suitable macro. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06	Linux 4.0-rc7 (tag v4.0-rc7)	Linus Torvalds

2015-04-06	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds
	Pull networking fixes from David Miller:
	1) In TCP, don't register an FRTO for cumulatively ACK'd data that was previously SACK'd, from Neal Cardwell.
	2) Need to hold RTNL mutex in ipv4 multicast code namespace cleanup, from Cong WANG.
	3) Similarly we have to hold RTNL mutex for fib_rules_unregister(), also from Cong WANG.
	4) Revert and rework netns nsid allocation fix, from Nicolas Dichtel.
	5) When we encapsulate for a tunnel device, skb->sk still points to the user socket. So this leads to cases where we retraverse the ipv4/ipv6 output path with skb->sk being of some other address family (f.e. AF_PACKET). This can cause things to crash since the ipv4 output path is dereferencing an AF_PACKET socket as if it were an ipv4 one. The short term fix for 'net' and -stable is to elide these socket checks once we've entered an encapsulation sequence by testing xmit_recursion. Longer term we have a better solution wherein we pass the tunnel's socket down through the output paths, but that is way too invasive for 'net' and -stable. From Hannes Frederic Sowa.
	6) l2tp_init() failure path forgets to unregister per-net ops, from Cong WANG.
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
	  net/mlx4_core: Fix error message deprecation for ConnectX-2 cards
	  net: dsa: fix filling routing table from OF description
	  l2tp: unregister l2tp_net_ops on failure path
	  mvneta: dont call mvneta_adjust_link() manually
	  ipv6: protect skb->sk accesses from recursive dereference inside the stack
	  netns: don't allocate an id for dead netns
	  Revert "netns: don't clear nsid too early on removal"
	  ip6mr: call del_timer_sync() in ip6mr_free_table()
	  net: move fib_rules_unregister() under rtnl lock
	  ipv4: take rtnl_lock and mark mrt table as freed on namespace cleanup
	  tcp: fix FRTO undo on cumulative ACK of SACKed range
	  xen-netfront: transmit fully GSO-sized packets