diff options
author | NeilBrown <neilb@suse.com> | 2017-04-10 12:22:09 +1000 |
---|---|---|
committer | Trond Myklebust <trond.myklebust@primarydata.com> | 2017-04-20 13:44:05 -0400 |
commit | 518662e0fcb9fa241fe90a337b59bc5066b2a930 (patch) | |
tree | 828d198fc53348eb87734bd689abe12cb0679287 /fs/nfs/pnfs_nfs.c | |
parent | f6148713b24107c478ce532a4be8e5c58437c80e (diff) | |
download | lwn-518662e0fcb9fa241fe90a337b59bc5066b2a930.tar.gz lwn-518662e0fcb9fa241fe90a337b59bc5066b2a930.zip |
NFS: fix usage of mempools.
When passed GFP flags that allow sleeping (such as
GFP_NOIO), mempool_alloc() will never return NULL, it will
wait until memory is available.
This means that we don't need to handle failure, but that we
do need to ensure one thread doesn't call mempool_alloc()
twice on the one pool without queuing or freeing the first
allocation. If multiple threads did this during times of
high memory pressure, the pool could be exhausted and a
deadlock could result.
pnfs_generic_alloc_ds_commits() attempts to allocate from
the nfs_commit_mempool while already holding an allocation
from that pool. This is not safe. So change
nfs_commitdata_alloc() to take a flag that indicates whether
failure is acceptable.
In pnfs_generic_alloc_ds_commits(), accept failure and
handle it as we currently do. Else where, do not accept
failure, and do not handle it.
Even when failure is acceptable, we want to succeed if
possible. That means both
- using an entry from the pool if there is one
- waiting for direct reclaim is there isn't.
We call mempool_alloc(GFP_NOWAIT) to achieve the first, then
kmem_cache_alloc(GFP_NOIO|__GFP_NORETRY) to achieve the
second. Each of these can fail, but together they do the
best they can without blocking indefinitely.
The objects returned by kmem_cache_alloc() will still be freed
by mempool_free(). This is safe as mempool_alloc() uses
exactly the same function to allocate objects (since the mempool
was created with mempool_create_slab_pool()). The object returned
by mempool_alloc() and kmem_cache_alloc() are indistinguishable
so mempool_free() will handle both identically, either adding to the
pool or calling kmem_cache_free().
Also, don't test for failure when allocating from
nfs_wdata_mempool.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Diffstat (limited to 'fs/nfs/pnfs_nfs.c')
-rw-r--r-- | fs/nfs/pnfs_nfs.c | 16 |
1 files changed, 5 insertions, 11 deletions
diff --git a/fs/nfs/pnfs_nfs.c b/fs/nfs/pnfs_nfs.c index 7250b95549ec..1edf5b84aba5 100644 --- a/fs/nfs/pnfs_nfs.c +++ b/fs/nfs/pnfs_nfs.c @@ -217,7 +217,7 @@ pnfs_generic_alloc_ds_commits(struct nfs_commit_info *cinfo, for (i = 0; i < fl_cinfo->nbuckets; i++, bucket++) { if (list_empty(&bucket->committing)) continue; - data = nfs_commitdata_alloc(); + data = nfs_commitdata_alloc(false); if (!data) break; data->ds_commit_index = i; @@ -283,16 +283,10 @@ pnfs_generic_commit_pagelist(struct inode *inode, struct list_head *mds_pages, unsigned int nreq = 0; if (!list_empty(mds_pages)) { - data = nfs_commitdata_alloc(); - if (data != NULL) { - data->ds_commit_index = -1; - list_add(&data->pages, &list); - nreq++; - } else { - nfs_retry_commit(mds_pages, NULL, cinfo, 0); - pnfs_generic_retry_commit(cinfo, 0); - return -ENOMEM; - } + data = nfs_commitdata_alloc(true); + data->ds_commit_index = -1; + list_add(&data->pages, &list); + nreq++; } nreq += pnfs_generic_alloc_ds_commits(cinfo, &list); |