diff options
author | Dan Williams <dan.j.williams@intel.com> | 2008-06-27 21:44:04 -0700 |
---|---|---|
committer | Dan Williams <dan.j.williams@intel.com> | 2008-06-30 17:18:19 -0700 |
commit | b5470dc5fc18a8ff6517c3bb538d1479e58ecb02 (patch) | |
tree | 37b0eb3a4691bdbe58dc5c6c73b2dc8d3925b332 /drivers/md/raid5.c | |
parent | 1fe797e67fb07d605b82300934d0de67068a0aca (diff) | |
download | lwn-b5470dc5fc18a8ff6517c3bb538d1479e58ecb02.tar.gz lwn-b5470dc5fc18a8ff6517c3bb538d1479e58ecb02.zip |
md: resolve external metadata handling deadlock in md_allow_write
md_allow_write() marks the metadata dirty while holding mddev->lock and then
waits for the write to complete. For externally managed metadata this causes a
deadlock as userspace needs to take the lock to communicate that the metadata
update has completed.
Change md_allow_write() in the 'external' case to start the 'mark active'
operation and then return -EAGAIN. The expected side effects while waiting for
userspace to write 'active' to 'array_state' are holding off reshape (code
currently handles -ENOMEM), cause some 'stripe_cache_size' change requests to
fail, cause some GET_BITMAP_FILE ioctl requests to fall back to GFP_NOIO, and
cause updates to 'raid_disks' to fail. Except for 'stripe_cache_size' changes
these failures can be mitigated by coordinating with mdmon.
md_write_start() still prevents writes from occurring until the metadata
handler has had a chance to take action as it unconditionally waits for
MD_CHANGE_CLEAN to be cleared.
[neilb@suse.de: return -EAGAIN, try GFP_NOIO]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Diffstat (limited to 'drivers/md/raid5.c')
-rw-r--r-- | drivers/md/raid5.c | 12 |
1 files changed, 9 insertions, 3 deletions
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 442622067cae..8f4c70a53210 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -911,14 +911,16 @@ static int resize_stripes(raid5_conf_t *conf, int newsize) struct stripe_head *osh, *nsh; LIST_HEAD(newstripes); struct disk_info *ndisks; - int err = 0; + int err; struct kmem_cache *sc; int i; if (newsize <= conf->pool_size) return 0; /* never bother to shrink */ - md_allow_write(conf->mddev); + err = md_allow_write(conf->mddev); + if (err) + return err; /* Step 1 */ sc = kmem_cache_create(conf->cache_name[1-conf->active_name], @@ -3843,6 +3845,8 @@ raid5_store_stripe_cache_size(mddev_t *mddev, const char *page, size_t len) { raid5_conf_t *conf = mddev_to_conf(mddev); unsigned long new; + int err; + if (len >= PAGE_SIZE) return -EINVAL; if (!conf) @@ -3858,7 +3862,9 @@ raid5_store_stripe_cache_size(mddev_t *mddev, const char *page, size_t len) else break; } - md_allow_write(mddev); + err = md_allow_write(mddev); + if (err) + return err; while (new > conf->max_nr_stripes) { if (grow_one_stripe(conf)) conf->max_nr_stripes++; |