diff options
author | Jeff Mahoney <jeffm@suse.com> | 2018-03-20 15:25:26 -0400 |
---|---|---|
committer | David Sterba <dsterba@suse.com> | 2018-03-31 01:41:12 +0200 |
commit | 75cb379d2635215ad2c67750693f7dc45ad19a5f (patch) | |
tree | 0e3a1380015be15bbed794330fae015c3ca6a537 /fs/btrfs/volumes.c | |
parent | dc2d3005d27da41247d6c42077e335a777afc79c (diff) | |
download | lwn-75cb379d2635215ad2c67750693f7dc45ad19a5f.tar.gz lwn-75cb379d2635215ad2c67750693f7dc45ad19a5f.zip |
btrfs: defer adding raid type kobject until after chunk relocation
Any time the first block group of a new type is created, we add a new
kobject to sysfs to hold the attributes for that type. Kobject-internal
allocations always use GFP_KERNEL, making them prone to fs-reclaim races.
While it appears as if this can occur any time a block group is created,
the only times the first block group of a new type can be created in
memory is at mount and when we create the first new block group during
raid conversion.
This patch adds a new list to track pending kobject additions and then
handles them after we do chunk relocation. Between relocating the
target chunk (or forcing allocation of a new chunk in the case of data)
and removing the old chunk, we're in a safe place for fs-reclaim to
occur. We're holding the volume mutex, which is already held across
page faults, and the delete_unused_bgs_mutex, which will only stall
the cleaner thread.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Diffstat (limited to 'fs/btrfs/volumes.c')
-rw-r--r-- | fs/btrfs/volumes.c | 12 |
1 files changed, 12 insertions, 0 deletions
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 73de042158f1..4fc6acf65220 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -3003,6 +3003,16 @@ static int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset) if (ret) return ret; + /* + * We add the kobjects here (and after forcing data chunk creation) + * since relocation is the only place we'll create chunks of a new + * type at runtime. The only place where we'll remove the last + * chunk of a type is the call immediately below this one. Even + * so, we're protected against races with the cleaner thread since + * we're covered by the delete_unused_bgs_mutex. + */ + btrfs_add_raid_kobjects(fs_info); + trans = btrfs_start_trans_remove_block_group(root->fs_info, chunk_offset); if (IS_ERR(trans)) { @@ -3130,6 +3140,8 @@ static int btrfs_may_alloc_data_chunk(struct btrfs_fs_info *fs_info, if (ret < 0) return ret; + btrfs_add_raid_kobjects(fs_info); + return 1; } } |