author | Brian Foster <bfoster@redhat.com> | 2018-10-18 17:21:49 +1100 |
---|---|---|
committer | Dave Chinner <david@fromorbit.com> | 2018-10-18 17:21:49 +1100 |
commit | efc3289cf8d39c34502a7cc9695ca2fa125aad0c (patch) | |
tree | 10d889728240ab71a728398ed87a02b5c791eb26 /fs/xfs/xfs_buf.c | |
parent | 26ca39015ef210d728df53d66c1ae85e8b48b2f3 (diff) | |
xfs: clear ail delwri queued bufs on unmount of shutdown fs
In the typical unmount case, the AIL is forced out by the unmount
sequence before the xfsaild task is stopped. Since AIL items are
removed on writeback completion, this means that the AIL
->ail_buf_list delwri queue has been drained. This is not always
true in the shutdown case, however.
It's possible for buffers to sit on a delwri queue for a period of
time across submission attempts if said items are locked or have
been relogged and pinned since first added to the queue. If the
attempt to log such an item results in a log I/O error, the error
processing can shut down the fs, remove the item from the AIL, stale
the buffer (dropping the LRU reference) and clear its delwri queue
state. The latter bit means the buffer will be released from a
delwri queue on the next submission attempt, but this might never
occur if the filesystem has shut down and the AIL is empty.
This means that such buffers are held indefinitely by the AIL delwri
queue across destruction of the AIL. Aside from being a memory leak,
these buffers can also hold references to in-core perag structures.
The latter problem manifests as a generic/475 failure, reproducing
the following asserts at unmount time:
XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0,
file: fs/xfs/xfs_mount.c, line: 151
XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0,
file: fs/xfs/xfs_mount.c, line: 132
To prevent this problem, clear the AIL delwri queue as a final step
before xfsaild() exit. The !empty state should never occur in the
normal case, so add an assert to catch unexpected problems going
forward.
[dgc: add comment explaining need for xfs_buf_delwri_cancel() after
calling xfs_buf_delwri_submit_nowait().]
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Diffstat (limited to 'fs/xfs/xfs_buf.c')
-rw-r--r-- | fs/xfs/xfs_buf.c | 7 |
1 files changed, 7 insertions, 0 deletions
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index a372476e265d..b21ea2ba768d 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -2055,6 +2055,13 @@ xfs_buf_delwri_submit_buffers(
  * is only safely useable for callers that can track I/O completion by higher
  * level means, e.g. AIL pushing as the @buffer_list is consumed in this
  * function.
+ *
+ * Note: this function will skip buffers it would block on, and in doing so
+ * leaves them on @buffer_list so they can be retried on a later pass. As such,
+ * it is up to the caller to ensure that the buffer list is fully submitted or
+ * cancelled appropriately when they are finished with the list. Failure to
+ * cancel or resubmit the list until it is empty will result in leaked buffers
+ * at unmount time.
  */
 int
 xfs_buf_delwri_submit_nowait(
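The caller contract described in the comment for xfs_buf_delwri_submit_nowait() can be illustrated with a small user-space model. This is only a sketch, not kernel code: struct model_buf, struct model_list and every model_* function are hypothetical stand-ins invented for illustration, the "holds" counter is a simplified substitute for buffer reference counting, and the skipped-buffer return value is a choice made for this model. It shows a non-blocking submit leaving a locked buffer on the list, and the list's reference being dropped only if the caller cancels the remainder before tearing down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer; NOT the kernel's struct xfs_buf. */
struct model_buf {
	bool locked;            /* a non-blocking submit must skip this buffer */
	int holds;              /* references held, including the queue's own */
	struct model_buf *next; /* singly linked delwri list */
};

/* Hypothetical stand-in for a delwri queue such as the AIL's buffer list. */
struct model_list {
	struct model_buf *head;
};

/* Queue a buffer; the list takes its own reference on it. */
static void model_delwri_queue(struct model_list *list, struct model_buf *bp)
{
	bp->holds++;
	bp->next = list->head;
	list->head = bp;
}

/*
 * Non-blocking submit: "write" every unlocked buffer and drop the list's
 * reference; leave locked buffers on the list for a later pass.
 * Returns the number of buffers skipped (an assumption of this model).
 */
static int model_delwri_submit_nowait(struct model_list *list)
{
	struct model_buf **pp = &list->head;
	int skipped = 0;

	while (*pp) {
		struct model_buf *bp = *pp;

		if (bp->locked) {
			skipped++;
			pp = &bp->next; /* keep it queued, move on */
			continue;
		}
		*pp = bp->next;     /* unlink and release */
		bp->next = NULL;
		bp->holds--;
	}
	return skipped;
}

/* Cancel: release every buffer still on the list without writing it. */
static void model_delwri_cancel(struct model_list *list)
{
	while (list->head) {
		struct model_buf *bp = list->head;

		list->head = bp->next;
		bp->next = NULL;
		bp->holds--;
	}
}

/*
 * Queue one unlocked and one locked buffer, submit once, then tear down.
 * Returns the number of references still held afterwards (leaks).
 */
static int model_teardown(bool cancel_on_exit)
{
	struct model_buf a = { .locked = false };
	struct model_buf b = { .locked = true };
	struct model_list list = { .head = NULL };
	int skipped;

	model_delwri_queue(&list, &a);
	model_delwri_queue(&list, &b);

	skipped = model_delwri_submit_nowait(&list);
	assert(skipped == 1);           /* only the locked buffer remains */

	if (cancel_on_exit) {
		model_delwri_cancel(&list);
		assert(list.head == NULL);  /* list fully drained */
	}
	return a.holds + b.holds;
}
```

In this model, model_teardown(false) leaves one reference behind because the skipped buffer is never released, while model_teardown(true) leaves none. That mirrors, in simplified form, why the commit message calls for clearing the AIL delwri queue as a final step before xfsaild() exits.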