Age | Commit message (Collapse) | Author |
|
Re-enabling NFSv3 LOCALIO is made more complex (than NFSv4) because v3
is stateless. As such, the hueristic used to identify a LOCALIO probe
point is more adhoc by nature: if/when NFSv3 client IO begins to
complete again in terms of normal RPC-based NFSv3 server IO, attempt
nfs_local_probe_async().
Care is taken to throttle the frequency of nfs_local_probe_async(),
otherwise there could be a flood of repeat calls to
nfs_local_probe_async().
The throttle is admin controlled using a new module parameter for
nfsv3, e.g.:
echo 512 > /sys/module/nfsv3/parameters/nfs3_localio_probe_throttle
Probe for NFSv3 LOCALIO every N IO requests (512 in this case). Must
be power-of-2, defaults to 0 (probing disabled).
On systems that expect to use LOCALIO with NFSv3 the admin should
configure the 'nfs3_localio_probe_throttle' module parameter.
This commit backfills module parameter documentation in localio.rst
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Also update Documentation/filesystems/nfs/localio.rst accordingly
and reduce the technical documentation debt that was previously
captured in that document.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
This commit simply adds the required O_DIRECT plumbing. It doesn't
address the fact that NFS doesn't ensure all writes are page aligned
(nor device logical block size aligned as required by O_DIRECT).
Because NFS will read-modify-write for IO that isn't aligned, LOCALIO
will not use O_DIRECT semantics by default if/when an application
requests the use of O_DIRECT. Allow the use of O_DIRECT semantics by:
1: Adding a flag to the nfs_pgio_header struct to allow the NFS
O_DIRECT layer to signal that O_DIRECT was used by the application
2: Adding a 'localio_O_DIRECT_semantics' NFS module parameter that
when enabled will cause LOCALIO to use O_DIRECT semantics (this may
cause IO to fail if applications do not properly align their IO).
This commit is derived from code developed by Weston Andros Adamson.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Benjamin Coddington <bcodding@redhat.com> says:
Last year both GFS2 and OCFS2 had some work done to make their locking more
robust when exported over NFS. Unfortunately, part of that work caused both
NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send
lock notifications to clients.
This in itself is not a huge problem because most NFS clients will still
poll the server in order to acquire a conflicted lock, but now that I've
noticed it I can't help but try to fix it because there are big advantages
for setups that might depend on timely lock notifications, and we've
supported that as a feature for a long time.
Its important for NLM and kNFSD that they do not block their kernel threads
inside filesystem's file_lock implementations because that can produce
deadlocks. We used to make sure of this by only trusting that
posix_lock_file() can correctly handle blocking lock calls asynchronously,
so the lock managers would only setup their file_lock requests for async
callbacks if the filesystem did not define its own lock() file operation.
However, when GFS2 and OCFS2 grew the capability to correctly
handle blocking lock requests asynchronously, they started signalling this
behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting
posix_lock_file() was inadvertently dropped, so now most filesystems no
longer produce lock notifications when exported over NFS.
I tried to fix this by simply including the old check for lock(), but the
resulting include mess and layering violations was more than I could accept.
There's a much cleaner way presented here using an fop_flag, which while
potentially flag-greedy, greatly simplifies the problem and grooms the
way for future uses by both filesystems and lock managers alike.
* patches from https://lore.kernel.org/r/cover.1726083391.git.bcodding@redhat.com:
exportfs: Remove EXPORT_OP_ASYNC_LOCK
NLM/NFSD: Fix lock notifications for async-capable filesystems
gfs2/ocfs2: set FOP_ASYNC_LOCK
fs: Introduce FOP_ASYNC_LOCK
NFS: trace: show TIMEDOUT instead of 0x6e
nfsd: use system_unbound_wq for nfsd_file_gc_worker()
nfsd: count nfsd_file allocations
nfsd: fix refcount leak when file is unhashed after being found
nfsd: remove unneeded EEXIST error check in nfsd_do_file_acquire
nfsd: add list_head nf_gc to struct nfsd_file
Link: https://lore.kernel.org/r/cover.1726083391.git.bcodding@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Now that GFS2 and OCFS2 are signalling async ->lock() support with
FOP_ASYNC_LOCK and checks for support are converted, we can remove
EXPORT_OP_ASYNC_LOCK.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/r/0a114db814fec3086f937ae3d44a086f13b8de26.1726083391.git.bcodding@redhat.com
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 92945bd81ca4 ("nfs: add Documentation/filesystems/nfs/localio.rst")
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
This section answers a new FAQ entry:
9. How does LOCALIO make certain that object lifetimes are managed
properly given NFSD and NFS operate in different contexts?
See the detailed "NFS Client and Server Interlock" section below.
The first half of the section details NeilBrown's elegant design
for LOCALIO's nfs_uuid_t based interlock and is heavily based on
Neil's "net namespace refcounting" description here:
https://marc.info/?l=linux-nfs&m=172498546024767&w=2
The second half of the section details the per-cpu-refcount introduced
to ensure NFSD's nfsd_serv isn't destroyed while in use by a LOCALIO
client.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Add a FAQ section to give answers to questions that have been raised
during review of the localio feature.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
This document gives an overview of the LOCALIO auxiliary RPC protocol
added to the Linux NFS client and server to allow them to reliably
handshake to determine if they are on the same host.
Once an NFS client and server handshake as "local", the client will
bypass the network RPC protocol for read, write and commit operations.
Due to this XDR and RPC bypass, these operations will operate faster.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fanotify fsid updates from Christian Brauner:
"This work is part of the plan to enable fanotify to serve as a drop-in
replacement for inotify. While inotify is availabe on all filesystems,
fanotify currently isn't.
In order to support fanotify on all filesystems two things are needed:
(1) all filesystems need to support AT_HANDLE_FID
(2) all filesystems need to report a non-zero f_fsid
This contains (1) and allows filesystems to encode non-decodable file
handlers for fanotify without implementing any exportfs operations by
encoding a file id of type FILEID_INO64_GEN from i_ino and
i_generation.
Filesystems that want to opt out of encoding non-decodable file ids
for fanotify that don't support NFS export can do so by providing an
empty export_operations struct.
This also partially addresses (2) by generating f_fsid for simple
filesystems as well as freevxfs. Remaining filesystems will be dealt
with by separate patches.
Finally, this contains the patch from the current exportfs maintainers
which moves exportfs under vfs with Chuck, Jeff, and Amir as
maintainers and vfs.git as tree"
* tag 'vfs-6.7.fsid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
MAINTAINERS: create an entry for exportfs
fs: fix build error with CONFIG_EXPORTFS=m or not defined
freevxfs: derive f_fsid from bdev->bd_dev
fs: report f_fsid from s_dev for "simple" filesystems
exportfs: support encoding non-decodeable file handles by default
exportfs: define FILEID_INO64_GEN* file handle types
exportfs: make ->encode_fh() a mandatory method for NFS export
exportfs: add helpers to check if filesystem can encode/decode file handles
|
|
Rename the default helper for encoding FILEID_INO32_GEN* file handles to
generic_encode_ino32_fh() and convert the filesystems that used the
default implementation to use the generic helper explicitly.
After this change, exportfs_encode_inode_fh() no longer has a default
implementation to encode FILEID_INO32_GEN* file handles.
This is a step towards allowing filesystems to encode non-decodeable
file handles for fanotify without having to implement any
export_operations.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20231023180801.2953446-3-amir73il@gmail.com
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This patch reverts mostly commit 40595cdc93ed ("nfs: block notification
on fs with its own ->lock") and introduces an EXPORT_OP_ASYNC_LOCK
export flag to signal that the "own ->lock" implementation supports
async lock requests. The only main user is DLM that is used by GFS2 and
OCFS2 filesystem. Those implement their own lock() implementation and
return FILE_LOCK_DEFERRED as return value. Since commit 40595cdc93ed
("nfs: block notification on fs with its own ->lock") the DLM
implementation were never updated. This patch should prepare for DLM
to set the EXPORT_OP_ASYNC_LOCK export flag and update the DLM
plock implementation regarding to it.
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Pull nfsd updates from Chuck Lever:
"I'm thrilled to announce that the Linux in-kernel NFS server now
offers NFSv4 write delegations. A write delegation enables a client to
cache data and metadata for a single file more aggressively, reducing
network round trips and server workload. Many thanks to Dai Ngo for
contributing this facility, and to Jeff Layton and Neil Brown for
reviewing and testing it.
This release also sees the removal of all support for DES- and
triple-DES-based Kerberos encryption types in the kernel's SunRPC
implementation. These encryption types have been deprecated by the
Internet community for years and are considered insecure. This change
affects both the in-kernel NFS client and server.
The server's UDP and TCP socket transports have now fully adopted
David Howells' new bio_vec iterator so that no more than one sendmsg()
call is needed to transmit each RPC message. In particular, this helps
kTLS optimize record boundaries when sending RPC-with-TLS replies, and
it takes the server a baby step closer to handling file I/O via
folios.
We've begun work on overhauling the SunRPC thread scheduler to remove
a costly linked-list walk when looking for an idle RPC service thread
to wake. The pre-requisites are included in this release. Thanks to
Neil Brown for his ongoing work on this improvement"
* tag 'nfsd-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (56 commits)
Documentation: Add missing documentation for EXPORT_OP flags
SUNRPC: Remove unused declaration rpc_modcount()
SUNRPC: Remove unused declarations
NFSD: da_addr_body field missing in some GETDEVICEINFO replies
SUNRPC: Remove return value of svc_pool_wake_idle_thread()
SUNRPC: make rqst_should_sleep() idempotent()
SUNRPC: Clean up svc_set_num_threads
SUNRPC: Count ingress RPC messages per svc_pool
SUNRPC: Deduplicate thread wake-up code
SUNRPC: Move trace_svc_xprt_enqueue
SUNRPC: Add enum svc_auth_status
SUNRPC: change svc_xprt::xpt_flags bits to enum
SUNRPC: change svc_rqst::rq_flags bits to enum
SUNRPC: change svc_pool::sp_flags bits to enum
SUNRPC: change cache_head.flags bits to enum
SUNRPC: remove timeout arg from svc_recv()
SUNRPC: change svc_recv() to return void.
SUNRPC: call svc_process() from svc_recv().
nfsd: separate nfsd_last_thread() from nfsd_put()
nfsd: Simplify code around svc_exit_thread() call in nfsd()
...
|
|
The commits that introduced these flags neglected to update the
Documentation/filesystems/nfs/exporting.rst file.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Fix typos in Documentation.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230814212822.193684-4-helgaas@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
So far, all callers of exportfs_encode_inode_fh(), except for fsnotify's
show_mark_fhandle(), check that filesystem can decode file handles, but
we would like to add more callers that do not require a file handle that
can be decoded.
Introduce a flag to explicitly request a file handle that may not to be
decoded later and a wrapper exportfs_encode_fid() that sets this flag
and convert show_mark_fhandle() to use the new wrapper.
This will be used to allow adding fanotify support to filesystems that
do not support NFS export.
Acked-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20230502124817.3070545-3-amir73il@gmail.com>
|
|
The sysfs path for the NFS4 client identfier should start with
the path component of 'nfs' for the kset, and then the 'net'
path component for the netns object, followed by the
'nfs_client' path component for the NFS client kobject,
and ending with 'identifier' for the netns_client_id
kobj_attribute.
Fixes: a28faaddb2be ("Documentation: Add an explanation of NFSv4 client identifiers")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1801326
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
To enable NFSv4 to work correctly, NFSv4 client identifiers have
to be globally unique and persistent over client reboots. We
believe that in many cases, a good default identifier can be
chosen and set when a client system is imaged.
Because there are many different ways a system can be imaged,
provide an explanation of how NFSv4 client identifiers and
principals can be set by install scripts and imaging tools.
Additional cases, such as NFSv4 clients running in containers, also
need unique and persistent identifiers. The Linux NFS community
sets forth this explanation to aid those who create and manage
container environments.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
We've supported reexport for a while but documentation is limited. This
is mainly a simplified version of the text I wrote for the linux-nfs
wiki at https://wiki.linux-nfs.org/wiki/index.php/NFS_re-export.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
It's not uncommon for some workloads to do a bunch of I/O to a file and
delete it just afterward. If knfsd has a cached open file however, then
the file may still be open when the dentry is unlinked. If the
underlying filesystem is nfs, then that could trigger it to do a
sillyrename.
On a REMOVE or RENAME scan the nfsd_file cache for open files that
correspond to the inode, and proactively unhash and put their
references. This should prevent any delete-on-last-close activity from
occurring, solely due to knfsd's open file cache.
This must be done synchronously though so we use the variants that call
flush_delayed_fput. There are deadlock possibilities if you call
flush_delayed_fput while holding locks, however. In the case of
nfsd_rename, we don't even do the lookups of the dentries to be renamed
until we've locked for rename.
Once we've figured out what the target dentry is for a rename, check to
see whether there are cached open files associated with it. If there
are, then unwind all of the locking, close them all, and then reattempt
the rename.
None of this is really necessary for "typical" filesystems though. It's
mostly of use for NFS, so declare a new export op flag and use that to
determine whether to close the files beforehand.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
When we start allowing NFS to be reexported, then we have some problems
when it comes to subtree checking. In principle, we could allow it, but
it would mean encoding parent info in the filehandles and there may not
be enough space for that in a NFSv3 filehandle.
To enforce this at export upcall time, we add a new export_ops flag
that declares the filesystem ineligible for subtree checking.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
With NFSv3 nfsd will always attempt to send along WCC data to the
client. This generally involves saving off the in-core inode information
prior to doing the operation on the given filehandle, and then issuing a
vfs_getattr to it after the op.
Some filesystems (particularly clustered or networked ones) have an
expensive ->getattr inode operation. Atomicity is also often difficult
or impossible to guarantee on such filesystems. For those, we're best
off not trying to provide WCC information to the client at all, and to
simply allow it to poll for that information as needed with a GETATTR
RPC.
This patch adds a new flags field to struct export_operations, and
defines a new EXPORT_OP_NOWCC flag that filesystems can use to indicate
that nfsd should not attempt to provide WCC info in NFSv3 replies. It
also adds a blurb about the new flags field and flag to the exporting
documentation.
The server will also now skip collecting this information for NFSv2 as
well, since that info is never used there anyway.
Note that this patch does not add this flag to any filesystem
export_operations structures. This was originally developed to allow
reexporting nfs via nfsd.
Other filesystems may want to consider enabling this flag too. It's hard
to tell however which ones have export operations to enable export via
knfsd and which ones mostly rely on them for open-by-filehandle support,
so I'm leaving that up to the individual maintainers to decide. I am
cc'ing the relevant lists for those filesystems that I think may want to
consider adding this though.
Cc: HPDD-discuss@lists.01.org
Cc: ceph-devel@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: fuse-devel@lists.sourceforge.net
Cc: ocfs2-devel@oss.oracle.com
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
This draft is an official RFC now.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Link: https://lore.kernel.org/r/20200621133552.46371-1-grandmaster@al2klimov.de
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert knfsd-stats.txt to ReST. Content remains mostly the same.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/20200129044917.566906-6-dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert nfs41-server.txt to ReST. ASCII tables were converted to ReST grid
table format.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/20200129044917.566906-5-dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert rpc-server-gss.txt to ReST. Content remains mostly unchanged.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/20200129044917.566906-4-dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert rpc-cache.txt to ReST. Changes aim to improve presentation
but the content itself remains mostly the same.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/20200129044917.566906-3-dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert pnfs.txt to ReST. Content remains mostly unchanged.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/20200129044917.566906-2-dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert fault_injection.txt to ReST and move it to admin-guide.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/f7b0cf8fb1159a668f75ce82a581e7590568c2b8.1578697871.git.dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert pnfs-scsi-server to ReST and move it to admin-guide. Content
remains mostly unchanged.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/5c4b8af41ca0a427a3987535815bccf47a65d320.1578697871.git.dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert pnfs-block-server.txt to ReST and move it to admin-guide.
Content remains mostly unchanged.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/c06903760e690c16d9df92f5e75f80381d6326d8.1578697871.git.dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert idmapper.txt to ReST and move it to admin-guide.
Content remains mostly unchanged otherwise.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/069e40cd551ea778538f8fe9ad15ee26e45fc748.1578697871.git.dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert nfsd-admin-interfaces to ReST and move it into admin-guide.
Content remains mostly untouched.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/d471305e9c96dec38f18d2ff816fca2269a88e29.1578697871.git.dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert nfs-rdma to ReST and move it to admin-guide. Content
remais mostly untouched. Also, mark the doc as obsolete.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/9c88f184f9de2a3eb5181563e258559efc02f58a.1578697871.git.dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Convert nfsroot.txt to RST and move it to admin-guide. Content remains
mostly the same.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/442d35917351f5260dd8ed7362e9b5f1264ef8ad.1578697871.git.dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
This patch converts nfs.txt to RST. It also moves it to admin-guide.
The reason for moving it is because this document contains information
useful for system administrators, as noted on the following paragraph:
'The purpose of this document is to provide information on some of the
special features of the NFS client that can be configured by system
administrators'.
Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com>
Link: https://lore.kernel.org/r/cb9f2da2f2f6dd432b4cf9e05f79f74f4d54b6ab.1578697871.git.dwlsalmeida@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
There are 3 remaining files without an extension inside the fs docs
dir.
Manually convert them to ReST.
In the case of the nfs/exporting.rst file, as the nfs docs
aren't ported yet, I opted to convert and add a :orphan: there,
with should be removed when it gets added into a nfs-specific
part of the fs documentation.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Those documents describe a kAPI. So, add to the driver-api
book.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
The two files there describes a Kernel API feature, used to
support early userspace stuff. Prepare for moving them to
the kernel API book by converting to ReST format.
The conversion itself was quite trivial: just add/mark a few
titles as such, add a literal block markup, add a table markup
and a few blank lines, in order to make Sphinx to properly parse it.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
Pull nfsd updates from Bruce Fields:
"Olga added support for the NFSv4.2 asynchronous copy protocol. We
already supported COPY, by copying a limited amount of data and then
returning a short result, letting the client resend. The asynchronous
protocol should offer better performance at the expense of some
complexity.
The other highlight is Trond's work to convert the duplicate reply
cache to a red-black tree, and to move it and some other server caches
to RCU. (Previously these have meant taking global spinlocks on every
RPC)
Otherwise, some RDMA work and miscellaneous bugfixes"
* tag 'nfsd-4.20' of git://linux-nfs.org/~bfields/linux: (30 commits)
lockd: fix access beyond unterminated strings in prints
nfsd: Fix an Oops in free_session()
nfsd: correctly decrement odstate refcount in error path
svcrdma: Increase the default connection credit limit
svcrdma: Remove try_module_get from backchannel
svcrdma: Remove ->release_rqst call in bc reply handler
svcrdma: Reduce max_send_sges
nfsd: fix fall-through annotations
knfsd: Improve lookup performance in the duplicate reply cache using an rbtree
knfsd: Further simplify the cache lookup
knfsd: Simplify NFS duplicate replay cache
knfsd: Remove dead code from nfsd_cache_lookup
SUNRPC: Simplify TCP receive code
SUNRPC: Replace the cache_detail->hash_lock with a regular spinlock
SUNRPC: Remove non-RCU protected lookup
NFS: Fix up a typo in nfs_dns_ent_put
NFS: Lockless DNS lookups
knfsd: Lockless lookup of NFSv4 identities.
SUNRPC: Lockless server RPCSEC_GSS context lookup
knfsd: Allow lockless lookups of the exports
...
|
|
Clean up the cache code by removing the non-RCU protected lookup.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
This is a respin with a wider audience (all that get_maintainer returned)
and I know this spams a *lot* of people. Not sure what would be the correct
way, so my apologies for ruining your inbox.
The 00-INDEX files are supposed to give a summary of all files present
in a directory, but these files are horribly out of date and their
usefulness is brought into question. Often a simple "ls" would reveal
the same information as the filenames are generally quite descriptive as
a short introduction to what the file covers (it should not surprise
anyone what Documentation/sched/sched-design-CFS.txt covers)
A few years back it was mentioned that these files were no longer really
needed, and they have since then grown further out of date, so perhaps
it is time to just throw them out.
A short status yields the following _outdated_ 00-INDEX files, first
counter is files listed in 00-INDEX but missing in the directory, last
is files present but not listed in 00-INDEX.
List of outdated 00-INDEX:
Documentation: (4/10)
Documentation/sysctl: (0/1)
Documentation/timers: (1/0)
Documentation/blockdev: (3/1)
Documentation/w1/slaves: (0/1)
Documentation/locking: (0/1)
Documentation/devicetree: (0/5)
Documentation/power: (1/1)
Documentation/powerpc: (0/5)
Documentation/arm: (1/0)
Documentation/x86: (0/9)
Documentation/x86/x86_64: (1/1)
Documentation/scsi: (4/4)
Documentation/filesystems: (2/9)
Documentation/filesystems/nfs: (0/2)
Documentation/cgroup-v1: (0/2)
Documentation/kbuild: (0/4)
Documentation/spi: (1/0)
Documentation/virtual/kvm: (1/0)
Documentation/scheduler: (0/2)
Documentation/fb: (0/1)
Documentation/block: (0/1)
Documentation/networking: (6/37)
Documentation/vm: (1/3)
Then there are 364 subdirectories in Documentation/ with several files that
are missing 00-INDEX alltogether (and another 120 with a single file and no
00-INDEX).
I don't really have an opinion to whether or not we /should/ have 00-INDEX,
but the above 00-INDEX should either be removed or be kept up to date. If
we should keep the files, I can try to keep them updated, but I rather not
if we just want to delete them anyway.
As a starting point, remove all index-files and references to 00-INDEX and
see where the discussion is going.
Signed-off-by: Henrik Austad <henrik@austad.us>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Just-do-it-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: [Almost everybody else]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Distributed filesystems are most effective when the server and client
clocks are synchronised. Embedded devices often use NFS for their
root filesystem but typically do not contain an RTC, so the clocks of
the NFS server and the embedded device will be out-of-sync when the root
filesystem is mounted (and may not be synchronised until late in the
boot process).
Extend ipconfig with the ability to export IP addresses of NTP servers
it discovers to /proc/net/ipconfig/ntp_servers. They can be supplied as
follows:
- If ipconfig is configured manually via the "ip=" or "nfsaddrs="
kernel command line parameters, one NTP server can be specified in
the new "<ntp0-ip>" parameter.
- If ipconfig is autoconfigured via DHCP, request DHCP option 42 in
the DHCPDISCOVER message, and record the IP addresses of up to three
NTP servers sent by the responding DHCP server in the subsequent
DHCPOFFER message.
ipconfig will only write the NTP server IP addresses it discovers to
/proc/net/ipconfig/ntp_servers, one per line (in the order received from
the DHCP server, if DHCP autoconfiguration is used); making use of these
NTP servers is the responsibility of a user space process (e.g. an
initrd/initram script that invokes an NTP client before mounting an NFS
root filesystem).
Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fully document the format used by the /proc/net/pnp file written by
ipconfig, explain where its values originate from, and clarify that the
tertiary name server IP and DNS domain name are only written to the file
when autoconfiguration is used.
Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ic_do_bootp_ext() is responsible for parsing the "ip=" and "nfsaddrs="
kernel parameters. If a "." character is found in parameter 4 (the
client's hostname), everything before the first "." is used as the
hostname, and everything after it is used as the NIS domain name (but
not necessarily the DNS domain name).
Document this behaviour in Documentation/filesystems/nfs/nfsroot.txt,
as it is not made explicit.
Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The original purpose of the per-superblock d_anon list was to
keep disconnected dentries in the cache between consecutive
requests to the NFS server. Dentries can be disconnected if
a client holds a file open and repeatedly performs IO on it,
and if the server drops the dentry, whether due to memory
pressure, server restart, or "echo 3 > /proc/sys/vm/drop_caches".
This purpose was thwarted by commit 75a6f82a0d10 ("freeing unlinked
file indefinitely delayed") which caused disconnected dentries
to be freed as soon as their refcount reached zero.
This means that, when a dentry being used by nfsd gets disconnected, a
new one needs to be allocated for every request (unless requests
overlap). As the dentry has no name, no parent, and no children,
there is little of value to cache. As small memory allocations are
typically fast (from per-cpu free lists) this likely has little cost.
This means that the original purpose of s_anon is no longer relevant:
there is no longer any need to keep disconnected dentries on a list so
they appear to be hashed.
However, s_anon now has a new use. When you mount an NFS filesystem,
the dentry stored in s_root is just a placebo. The "real" root dentry
is allocated using d_obtain_root() and so it kept on the s_anon list.
I don't know the reason for this, but suspect it related to NFSv4
where a mount of "server:/some/path" require NFS to look up the root
filehandle on the server, then walk down "/some" and "/path" to get
the filehandle to mount.
Whatever the reason, NFS depends on the s_anon list and on
shrink_dcache_for_umount() pruning all dentries on this list. So we
cannot simply remove s_anon.
We could just leave the code unchanged, but apart from that being
potentially confusing, the (unfair) bit-spin-lock which protects
s_anon can become a bottle neck when lots of disconnected dentries are
being created.
So this patch renames s_anon to s_roots, and stops storing
disconnected dentries on the list. Only dentries obtained with
d_obtain_root() are now stored on this list. There are many fewer of
these (only NFS and NILFS2 use the call, and only during filesystem
mount) so contention on the bit-lock will not be a problem.
Possibly an alternate solution should be found for NFS and NILFS2, but
that would require understanding their needs first.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Adjusts for ReST markup and moves under keys security devel index.
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable bugfixes:
- Fix use after free in write error path
- Use GFP_NOIO for two allocations in writeback
- Fix a hang in OPEN related to server reboot
- Check the result of nfs4_pnfs_ds_connect
- Fix an rcu lock leak
Features:
- Removal of the unmaintained and unused OSD pNFS layout
- Cleanup and removal of lots of unnecessary dprintk()s
- Cleanup and removal of some memory failure paths now that GFP_NOFS
is guaranteed to never fail.
- Remove the v3-only data server limitation on pNFS/flexfiles
Bugfixes:
- RPC/RDMA connection handling bugfixes
- Copy offload: fixes to ensure the copied data is COMMITed to disk.
- Readdir: switch back to using the ->iterate VFS interface
- File locking fixes from Ben Coddington
- Various use-after-free and deadlock issues in pNFS
- Write path bugfixes"
* tag 'nfs-for-4.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (89 commits)
pNFS/flexfiles: Always attempt to call layoutstats when flexfiles is enabled
NFSv4.1: Work around a Linux server bug...
NFS append COMMIT after synchronous COPY
NFSv4: Fix exclusive create attributes encoding
NFSv4: Fix an rcu lock leak
nfs: use kmap/kunmap directly
NFS: always treat the invocation of nfs_getattr as cache hit when noac is on
Fix nfs_client refcounting if kmalloc fails in nfs4_proc_exchange_id and nfs4_proc_async_renew
NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION
pNFS: Fix NULL dereference in pnfs_generic_alloc_ds_commits
pNFS: Fix a typo in pnfs_generic_alloc_ds_commits
pNFS: Fix a deadlock when coalescing writes and returning the layout
pNFS: Don't clear the layout return info if there are segments to return
pNFS: Ensure we commit the layout if it has been invalidated
pNFS: Don't send COMMITs to the DSes if the server invalidated our layout
pNFS/flexfiles: Fix up the ff_layout_write_pagelist failure path
pNFS: Ensure we check layout validity before marking it for return
NFS4.1 handle interrupted slot reuse from ERR_DELAY
NFSv4: check return value of xdr_inline_decode
nfs/filelayout: fix NULL pointer dereference in fl_pnfs_update_layout()
...
|
|
The objlayout code has been in the tree, but it's been unmaintained and
no server product for it actually ever shipped.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|