diff options
author | Lukas Czerner <lczerner@redhat.com> | 2013-04-09 22:11:22 -0400 |
---|---|---|
committer | Theodore Ts'o <tytso@mit.edu> | 2013-04-09 22:11:22 -0400 |
commit | 27dd43854227bb0e6ab70129bd21b60d396db2e7 (patch) | |
tree | 14e490b9d0ac63583849c4cad4b3ad0123902c3a /fs/ext4/inode.c | |
parent | f45a5ef91bef7e02149a216ed6dc3fcdd8b38268 (diff) | |
download | lwn-27dd43854227bb0e6ab70129bd21b60d396db2e7.tar.gz lwn-27dd43854227bb0e6ab70129bd21b60d396db2e7.zip |
ext4: introduce reserved space
Currently in ENOSPC condition when writing into unwritten space, or
punching a hole, we might need to split the extent and grow extent tree.
However since we can not allocate any new metadata blocks we'll have to
zero out unwritten part of extent or punched out part of extent, or in
the worst case return ENOSPC even though use actually does not allocate
any space.
Also in delalloc path we do reserve metadata and data blocks for the
time we're going to write out, however metadata block reservation is
very tricky especially since we expect that logical connectivity implies
physical connectivity, however that might not be the case and hence we
might end up allocating more metadata blocks than previously reserved.
So in future, metadata reservation checks should be removed since we can
not assure that we do not under reserve.
And this is where reserved space comes into the picture. When mounting
the file system we slice off a little bit of the file system space (2%
or 4096 clusters, whichever is smaller) which can be then used for the
cases mentioned above to prevent costly zeroout, or unexpected ENOSPC.
The number of reserved clusters can be set via sysfs, however it can
never be bigger than number of free clusters in the file system.
Note that this patch fixes the failure of xfstest 274 as expected.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Diffstat (limited to 'fs/ext4/inode.c')
-rw-r--r-- | fs/ext4/inode.c | 11 |
1 files changed, 10 insertions, 1 deletions
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index f9b0b479ff4c..629d67b62dfb 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1688,12 +1688,21 @@ static void mpage_da_map_and_submit(struct mpage_da_data *mpd) */ map.m_lblk = next; map.m_len = max_blocks; - get_blocks_flags = EXT4_GET_BLOCKS_CREATE; + /* + * We're in delalloc path and it is possible that we're going to + * need more metadata blocks than previously reserved. However + * we must not fail because we're in writeback and there is + * nothing we can do about it so it might result in data loss. + * So use reserved blocks to allocate metadata if possible. + */ + get_blocks_flags = EXT4_GET_BLOCKS_CREATE | + EXT4_GET_BLOCKS_METADATA_NOFAIL; if (ext4_should_dioread_nolock(mpd->inode)) get_blocks_flags |= EXT4_GET_BLOCKS_IO_CREATE_EXT; if (mpd->b_state & (1 << BH_Delay)) get_blocks_flags |= EXT4_GET_BLOCKS_DELALLOC_RESERVE; + blks = ext4_map_blocks(handle, mpd->inode, &map, get_blocks_flags); if (blks < 0) { struct super_block *sb = mpd->inode->i_sb; |