lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2010-04-27	NFS: Ensure that nfs_wb_page() waits for Pg_writeback to clear	Trond Myklebust
	Neil Brown reports that he is seeing the BUG_ON(ret == 0) trigger in nfs_page_async_flush. According to the trace in https://bugzilla.novell.com/show_bug.cgi?id=599628 the problem appears to be due to nfs_wb_page() not waiting for the PG_writeback flag to clear. There is a ditto problem in nfs_wb_page_cancel() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-22	NFS: Fix an unstable write data integrity race	Trond Myklebust
	Commit 2c61be0a9478258f77b66208a0c4b1f5f8161c3c (NFS: Ensure that the WRITE and COMMIT RPC calls are always uninterruptible) exposed a race on file close. In order to ensure correct close-to-open behaviour, we want to wait for all outstanding background commit operations to complete. This patch adds an inode flag that indicates if a commit operation is under way, and provides a mechanism to allow ->write_inode() to wait for its completion if this is a data integrity flush. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-22	nfs: testing for null instead of ERR_PTR()	Dan Carpenter
	nfs_path() returns an ERR_PTR(), it doesn't return null. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-22	NFS: rsize and wsize settings ignored on v4 mounts	Chuck Lever
	NFSv4 mounts ignore the rsize and wsize mount options, and always use the default transfer size for both. This seems to be because all NFSv4 mounts are now cloned, and the cloning logic doesn't copy the rsize and wsize settings from the parent nfs_server. I tested Fedora's 2.6.32.11-99 and it seems to have this problem as well, so I'm guessing that .33, .32, and perhaps older kernels have this issue as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Stable <stable@kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-22	NFSv4: Don't attempt an atomic open if the file is a mountpoint	Trond Myklebust
	Fix https://bugzilla.kernel.org/show_bug.cgi?id=15789 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-12	NFSv4: fix delegated locking	Trond Myklebust
	Arnaud Giersch reports that NFSv4 locking is broken when we hold a delegation since commit 8e469ebd6dc32cbaf620e134d79f740bf0ebab79 (NFSv4: Don't allow posix locking against servers that don't support it). According to Arnaud, the lock succeeds the first time he opens the file (since we cannot do a delegated open) but then fails after we start using delegated opens. The following patch fixes it by ensuring that locking behaviour is governed by a per-filesystem capability flag that is initially set, but gets cleared if the server ever returns an OPEN without the NFS4_OPEN_RESULT_LOCKTYPE_POSIX flag being set. Reported-by: Arnaud Giersch <arnaud.giersch@iut-bm.univ-fcomte.fr> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-04-09	NFS: Ensure that the WRITE and COMMIT RPC calls are always uninterruptible	Trond Myklebust
	We always want to ensure that WRITE and COMMIT completes, whether or not the user presses ^C. Do this by making the call asynchronous, and allowing the user to do an interruptible wait for rpc_task completion. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-09	NFS: Fix a race with the new commit code	Trond Myklebust
	This patch fixes a race which occurs due to the fact that we release the PG_writeback flag while still holding the nfs_page locked. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-09	NFS: Ensure that writeback_single_inode() calls write_inode() when syncing	Trond Myklebust
	Since writeback_single_inode() checks the inode->i_state flags _before_ it flushes out the data, we need to ensure that the I_DIRTY_DATASYNC flag is already set. Otherwise we risk not seeing a call to write_inode(), which again means that we break fsync() et al... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-09	NFS: Fix the mode calculation in nfs_find_open_context	Trond Myklebust
	Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-09	NFSv4: Fall back to ordinary lookup if nfs4_atomic_open() returns EISDIR	Trond Myklebust
	Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-04-07	Have nfs ->d_revalidate() report errors properly	Al Viro
	If nfs atomic open implementation ends up doing open request from ->d_revalidate() codepath and gets an error from server, return that error to caller explicitly and don't bother with lookup_instantiate_filp() at all. ->d_revalidate() can return an error itself just fine... See http://bugzilla.kernel.org/show_bug.cgi?id=15674 http://marc.info/?l=linux-kernel&m=126988782722711&w=2 for original report. Reported-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-30	include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵	Tejun Heo
	implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-22	NFS: don't try to decode GETATTR if DELEGRETURN returned error	Jeff Layton
	The reply parsing code attempts to decode the GETATTR response even if the DELEGRETURN portion of the compound returned an error. The GETATTR response won't actually exist if that's the case and we're asking the parser to read past the end of the response. This bug is fairly benign. The parser catches this without reading past the end of the response and decode_getfattr returns -EIO. Earlier kernels however had decode_op_hdr using the READ_BUF macro, and this bug would make this printk pop any time the client got an error from a delegreturn: kernel: decode_op_hdr: reply buffer overflowed in line XXXX More recent kernels seem to have replaced this printk with a dprintk. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-19	NFS: Prevent another deadlock in nfs_release_page()	Trond Myklebust
	We should not attempt to free the page if __GFP_FS is not set. Otherwise we can deadlock as per http://bugzilla.kernel.org/show_bug.cgi?id=15578 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-03-15	NFS: ensure bdi_unregister is called on mount failure.	NeilBrown
	bdi_unregister is called by nfs_put_super which is only called by generic_shutdown_super if ->s_root is not NULL. So if we error out in a circumstance where we called nfs_bdi_register (i.e. server != NULL) but have not set s_root, then we need to call bdi_unregister explicitly in nfs_get_sb and various other *_get_sb() functions. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-11	NFS: Avoid a deadlock in nfs_release_page	Trond Myklebust
	J.R. Okajima reports the following deadlock: INFO: task kswapd0:305 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kswapd0 D 0000000000000001 0 305 2 0x00000000 ffff88001f21d4f0 0000000000000046 ffff88001fdea680 ffff88001f21c000 ffff88001f21dfd8 ffff88001f21c000 ffff88001f21dfd8 ffff88001f21dfd8 ffff88001fdea040 0000000000014c00 0000000000000001 ffff88001fdea040 Call Trace: [<ffffffff8146155d>] io_schedule+0x4d/0x70 [<ffffffff810d2be5>] sync_page+0x65/0xa0 [<ffffffff81461b12>] __wait_on_bit_lock+0x52/0xb0 [<ffffffff810d2b80>] ? sync_page+0x0/0xa0 [<ffffffff810d2b64>] __lock_page+0x64/0x70 [<ffffffff81070ce0>] ? wake_bit_function+0x0/0x40 [<ffffffff810df1d4>] truncate_inode_pages_range+0x344/0x4a0 [<ffffffff810df340>] truncate_inode_pages+0x10/0x20 [<ffffffff8112cbfe>] generic_delete_inode+0x15e/0x190 [<ffffffff8112cc8d>] generic_drop_inode+0x5d/0x80 [<ffffffff8112bb88>] iput+0x78/0x80 [<ffffffff811bc908>] nfs_dentry_iput+0x38/0x50 [<ffffffff811285f4>] dentry_iput+0x84/0x110 [<ffffffff811286ae>] d_kill+0x2e/0x60 [<ffffffff8112912a>] dput+0x7a/0x170 [<ffffffff8111e925>] path_put+0x15/0x40 [<ffffffff811c3a44>] __put_nfs_open_context+0xa4/0xb0 [<ffffffff811cb5d0>] ? nfs_free_request+0x0/0x50 [<ffffffff811c3b0b>] put_nfs_open_context+0xb/0x10 [<ffffffff811cb5f9>] nfs_free_request+0x29/0x50 [<ffffffff81234b7e>] kref_put+0x8e/0xe0 [<ffffffff811cb594>] nfs_release_request+0x14/0x20 [<ffffffff811cf769>] nfs_find_and_lock_request+0x89/0xa0 [<ffffffff811d1180>] nfs_wb_page+0x80/0x110 [<ffffffff811c0770>] nfs_release_page+0x70/0x90 [<ffffffff810d18ee>] try_to_release_page+0x5e/0x80 [<ffffffff810e1178>] shrink_page_list+0x638/0x860 [<ffffffff810e19de>] shrink_zone+0x63e/0xc40 We can fix this by making the call to put_nfs_open_context() happen when we actually remove the write request from the inode (which is done by the nfsiod thread in this case). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-03-10	NFSv4: Don't ignore the NFS_INO_REVAL_FORCED flag in nfs_revalidate_inode()	Trond Myklebust
	If the NFS_INO_REVAL_FORCED flag is set, that means that we don't yet have an up to date attribute cache. Even if we hold a delegation, we must put a GETATTR on the wire. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-03-08	nfs4: Make the v4 callback service hidden	Steve Dickson
	To avoid hangs in the svc_unregister(), on version 4 mounts (and unmounts), when rpcbind is not running, make the nfs4 callback program an 'hidden' service by setting the 'vs_hidden' flag in the nfs4_callback_version structure. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-08	nfs: fix unlikely memory leak	Dan Carpenter
	I'll admit that it's unlikely for the first allocation to fail and the second one to succeed. I won't be offended if you ignore this patch. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-06	Merge branch 'for-2.6.34' of git://linux-nfs.org/~bfields/linux	Linus Torvalds
	* 'for-2.6.34' of git://linux-nfs.org/~bfields/linux: (22 commits) nfsd4: fix minor memory leak svcrpc: treat uid's as unsigned nfsd: ensure sockets are closed on error Revert "sunrpc: move the close processing after do recvfrom method" Revert "sunrpc: fix peername failed on closed listener" sunrpc: remove unnecessary svc_xprt_put NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN xfs_export_operations.commit_metadata commit_metadata export operation replacing nfsd_sync_dir lockd: don't clear sm_monitored on nsm_reboot_lookup lockd: release reference to nsm_handle in nlm_host_rebooted nfsd: Use vfs_fsync_range() in nfsd_commit NFSD: Create PF_INET6 listener in write_ports SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found" SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt() NFSD: Support AF_INET6 in svc_addsock() function SUNRPC: Use rpc_pton() in ip_map_parse() nfsd: 4.1 has an rfc number nfsd41: Create the recovery entry for the NFSv4.1 client nfsd: use vfs_fsync for non-directories ...
2010-03-05	Merge branch 'writeback-for-2.6.34' into nfs-for-2.6.34	Trond Myklebust

2010-03-05	NFS: Remove requirement for inode->i_mutex from nfs_invalidate_mapping	Trond Myklebust
	Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-05	NFS: Clean up nfs_sync_mapping	Trond Myklebust
	Remove the redundant call to filemap_write_and_wait(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-05	NFS: Simplify nfs_wb_page()	Trond Myklebust
	Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-05	NFS: Replace __nfs_write_mapping with sync_inode()	Trond Myklebust
	Now that we have correct COMMIT semantics in writeback_single_inode, we can reduce and simplify nfs_wb_all(). Also replace nfs_wb_nocommit() with a call to filemap_write_and_wait(), which doesn't need to hold the inode->i_mutex. With that done, we can eliminate nfs_write_mapping() altogether. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-05	NFS: Simplify nfs_wb_page_cancel()	Trond Myklebust
	In all cases we should be able to just remove the request and call cancel_dirty_page(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-05	NFS: Ensure inode is always marked I_DIRTY_DATASYNC, if it has unstable pages	Trond Myklebust
	Since nfs_scan_list() doesn't wait for locked pages, we have a race in which it is possible to end up with an inode that needs to send a COMMIT, but which does not have the I_DIRTY_DATASYNC flag set. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-05	NFS: Run COMMIT as an asynchronous RPC call when wbc->for_background is set	Trond Myklebust
	Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Wu Fengguang <fengguang.wu@intel.com>
2010-03-05	NFS: Reduce the number of unnecessary COMMIT calls	Trond Myklebust
	If the caller is doing a non-blocking flush, and there are still writebacks pending on the wire, we can usually defer the COMMIT call until those writes are done. Also ensure that we honour the wbc->nonblocking flag. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-05	NFS: Add a count of the number of unstable writes carried by an inode	Trond Myklebust
	In order to know when we should do opportunistic commits of the unstable writes, when the VM is doing a background flush, we add a field to count the number of unstable writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-05	NFS: Cleanup - move nfs_write_inode() into fs/nfs/write.c	Trond Myklebust
	The sole purpose of nfs_write_inode is to commit unstable writes, so move it into fs/nfs/write.c, and make nfs_commit_inode static. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-05	pass writeback_control to ->write_inode	Christoph Hellwig
	This gives the filesystem more information about the writeback that is happening. Trond requested this for the NFS unstable write handling, and other filesystems might benefit from this too by beeing able to distinguish between the different callers in more detail. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-05	make sure data is on disk before calling ->write_inode	Christoph Hellwig
	Similar to the fsync issue fixed a while ago in commit 2daea67e966dc0c42067ebea015ddac6834cef88 we need to write for data to actually hit the disk before writing out the metadata to guarantee data integrity for filesystems that modify the inode in the data I/O completion path. Currently XFS and NFS handle this manually, and AFS has a write_inode method that does nothing but waiting for data, while others are possibly missing out on this. Fortunately this change has a lot less impact than the fsync change as none of the write_inode methods starts data writeout of any form by itself. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-04	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs into for-2.6.34-incoming	J. Bruce Fields
	Resolve merge conflict in fs/xfs/linux-2.6/xfs_export.c.
2010-03-04	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits) init: Open /dev/console from rootfs mqueue: fix typo "failues" -> "failures" mqueue: only set error codes if they are really necessary mqueue: simplify do_open() error handling mqueue: apply mathematics distributivity on mq_bytes calculation mqueue: remove unneeded info->messages initialization mqueue: fix mq_open() file descriptor leak on user-space processes fix race in d_splice_alias() set S_DEAD on unlink() and non-directory rename() victims vfs: add NOFOLLOW flag to umount(2) get rid of ->mnt_parent in tomoyo/realpath hppfs can use existing proc_mnt, no need for do_kern_mount() in there Mirror MS_KERNMOUNT in ->mnt_flags get rid of useless vfsmount_lock use in put_mnt_ns() Take vfsmount_lock to fs/internal.h get rid of insanity with namespace roots in tomoyo take check for new events in namespace (guts of mounts_poll()) to namespace.c Don't mess with generic_permission() under ->d_lock in hpfs sanitize const/signedness for udf nilfs: sanitize const/signedness in dealing with ->d_name.name ... Fix up fairly trivial (famous last words...) conflicts in drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c
2010-03-03	a couple of mntget+dget -> path_get in nfs4proc	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03	Switch alloc_nfs_open_context() to struct path	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: add __percpu sparse annotations to what's left percpu: add __percpu sparse annotations to fs percpu: add __percpu sparse annotations to core kernel subsystems local_t: Remove leftover local.h this_cpu: Remove pageset_notifier this_cpu: Page allocator conversion percpu, x86: Generic inc / dec percpu instructions local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c module: Use this_cpu_xx to dynamically allocate counters local_t: Remove cpu_local_xx macros percpu: refactor the code in pcpu_[de]populate_chunk() percpu: remove compile warnings caused by __verify_pcpu_ptr() percpu: make accessors check for percpu pointer in sparse percpu: add __percpu for sparse. percpu: make access macros universal percpu: remove per_cpu__ prefix.
2010-03-02	nfs41 fix NFS4ERR_CLID_INUSE for exchange id	Andy Adamson
	Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-02	NFS: Fix an allocation-under-spinlock bug	Trond Myklebust
	sunrpc_cache_update() will always call detail->update() from inside the detail->hash_lock, so it cannot allocate memory. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-03-02	NFSv4.1: Various fixes to the sequence flag error handling	Trond Myklebust
	Ensure that we change the EXCHANGE_ID verifier (i.e. clp->cl_boot_time) when we want to reset all state. This is mainly needed when the server tells us that it is revoking our open or lock stateids. Handle revoking of recallable state by expiring the delegations. Handle callback path issues by expiring the delegations and then resetting the session. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-02	nfs4: renewd renew operations should take/put a client reference	Alexandros Batsakis
	renewd sends RENEW requests to the NFS server in order to renew state. As the request is asynchronous, renewd should take a reference to the nfs_client to prevent concurrent umounts from freeing the client Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-02	nfs41: renewd sequence operations should take/put client reference	Alexandros Batsakis
	renewd sends SEQUENCE requests to the NFS server in order to renew state. As the request is asynchronous, renewd should take a reference to the nfs_client to prevent concurrent umounts from freeing the session/client Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-02	nfs: prevent backlogging of renewd requests	Alexandros Batsakis
	If the renewd send queue gets backlogged (e.g., if the server goes down), we will keep filling the queue with periodic RENEW/SEQUENCE requests. This patch schedules a new renewd request if and only if the previous one returns (either success or failure) Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [Trond.Myklebust@netapp.com: moved nfs4_schedule_state_renewal() into separate nfs4_renew_release() and nfs41_sequence_release() callbacks to ensure correct behaviour on call setup failure] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-02	nfs: kill renewd before clearing client minor version	Alexandros Batsakis
	renewd should be synchronously killed before we destroy the session in nfs4_clear_minor_version Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [Trond.Myklebust@netapp.com: clean up to remove 'unused function warning when !CONFIG_NFS_V4] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-02-26	Remove EXPERIMENTAL from NFS_FSCACHE	Christian Kujau
	There's currently an open Ubuntu bug[0], with the intent to compile NFS_FSCACHE (and possibly AFS_FSCACHE, 9P_FSCACHE) into the standard Ubuntu kernel. However, since *_FSCACHE still depends on EXPERIMENTAL, this won't happen. As Arjan van de Ven pointed out[1], the EXPERIMENTAL flag doesn't mean that much any more, I propose the following patch to fs/nfs/Kconfig. I'd do the same for fs/9p/Kconfig and fs/afs/Kconfig, but as I did not test 9p or AFS, I feel it would not be appropriate for me to remove the flag. [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/440522/comments/5 [1] http://lkml.org/lkml/2010/1/23/145 Signed-off-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-17	percpu: add __percpu sparse annotations to fs	Tejun Heo
	Add __percpu sparse annotations to fs. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
2010-02-15	NFS: Too many GETATTR and ACCESS calls after direct I/O	Chuck Lever
	The cached read and write paths initialize fattr->time_start in their setup procedures. The value of fattr->time_start is propagated to read_cache_jiffies by nfs_update_inode(). Subsequent calls to nfs_attribute_timeout() will then use a good time stamp when computing the attribute cache timeout, and squelch unneeded GETATTR calls. Since the direct I/O paths erroneously leave the inode's fattr->time_start field set to zero, read_cache_jiffies for that inode is set to zero after any direct read or write operation. This triggers an otw GETATTR or ACCESS call to update the file's attribute and access caches properly, even when the NFS READ or WRITE replies have usable post-op attributes. Make sure the direct read and write setup code performs the same fattr initialization as the cached I/O paths to prevent unnecessary GETATTR calls. This was likely introduced by commit 0e574af1 in 2.6.15, which appears to add new nfs_fattr_init() call sites in the cached read and write paths, but not in the equivalent places in fs/nfs/direct.c. A subsequent commit in the same series, 33801147, introduces the fattr->time_start field. Interestingly, the direct write reschedule path already has a call to nfs_fattr_init() in the right place. Reported-by: Quentin Barnes <qbarnes@yahoo-inc.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-10	NFS: Make close(2) asynchronous when closing NFS O_DIRECT files	Chuck Lever
	For NFSv2 and v3: O_DIRECT writes are always synchronous, and aren't cached, so nothing should be flushed when closing an NFS O_DIRECT file descriptor. Thus there are no write errors to report on close(2). In addition, there's no cached data to verify on the next open(2), so we don't need clean GETATTR results at close time to compare with. Thus, there's no need for the nfs_revalidate_inode() call when closing an NFS O_DIRECT file. This reduces the number of synchronous on-the-wire requests for a simple open-write-close of an NFS O_DIRECT file by roughly 20%. For NFSv4: Call nfs4_do_close() with wait set to zero when closing an NFS O_DIRECT file. The CLOSE will go on the wire, but the application won't wait for it to complete. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>