author     Pavel Mironchik <tibor0@gmail.com>        2006-08-31 21:27:47 -0700
committer  Linus Torvalds <torvalds@g5.osdl.org>     2006-09-01 11:39:09 -0700
commit     0b1d647a02c5a1b67d45287eeb6cb3b2219c41c3
tree       13f9caa7b0ebd17dff481f854ac8803aae01234f
parent     1e5f5e5cd65eec6ce5c24a9c29f3e52673b121a6
[PATCH] dm: work around mempool_alloc, bio_alloc_bioset deadlocks
This patch works around a complex dm-related deadlock/livelock down in the
mempool allocator.
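The failure mode, roughly: mempool_alloc() sleeps waiting for someone to call
mempool_free(), but when stacked device-mapper targets draw from the same
exhausted pool (see Alasdair's note below), the free it is waiting for can
itself depend on the blocked allocation. A minimal illustrative sketch, not
taken from device-mapper or from this patch: the two target functions and the
sharing arrangement are hypothetical, only the mempool_* calls are the real
<linux/mempool.h> API.

/*
 * Illustrative sketch only.  "Upper" and "lower" targets stand in for
 * stacked dm devices that wrongly share one mempool reserve.
 */
#include <linux/init.h>
#include <linux/errno.h>
#include <linux/mempool.h>

static mempool_t *shared_pool;	/* wrongly shared by both stacked targets */

/* Lower target: needs its own element from the same pool to forward the I/O. */
static void lower_target_io(void *buf)
{
	/*
	 * The caller has already drained the reserve.  If the system is
	 * also short on reclaimable memory, mempool_alloc() parks this
	 * task in io_schedule() waiting for a mempool_free() -- but the
	 * only free will happen after this very I/O completes.
	 */
	void *clone = mempool_alloc(shared_pool, GFP_NOIO);

	/* not reached while the pool stays empty */
	mempool_free(clone, shared_pool);
	mempool_free(buf, shared_pool);
}

/* Upper target: takes the last reserved element and pushes the I/O down. */
static void upper_target_io(void)
{
	void *buf = mempool_alloc(shared_pool, GFP_NOIO);

	lower_target_io(buf);
}

static int __init stacked_pool_example_init(void)
{
	/* a single reserved 256-byte element makes the hang easy to hit */
	shared_pool = mempool_create_kmalloc_pool(1, 256);
	if (!shared_pool)
		return -ENOMEM;

	/* under memory pressure, this call chain is where the hang occurs */
	upper_target_io();
	return 0;
}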
Alasdair said:
Several dm targets suffer from this.
Mempools are not yet used correctly everywhere in device-mapper: they can
get shared when devices are stacked, and some targets share them across
multiple instances. I made fixing this one of the prerequisites for this
patch:
md-dm-reduce-stack-usage-with-stacked-block-devices.patch
which in some cases makes people more likely to hit the problem.
There's been some progress on this recently with (unfinished) dm-crypt
patches at:
http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/
(dm-crypt-move-io-to-workqueue.patch plus dependencies)
and:
I've no problems with a temporary workaround like that, but Milan Broz (a
new Red Hat developer in the Czech Republic) has started reviewing all the
mempool usage in device-mapper, so I'm expecting we'll soon have a proper fix
for this and the associated problems. [He's back from holiday at the start of
next week.]
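One direction such a cleanup could take, sketched here purely as an assumption
about the shape of the eventual fix rather than a description of it: give each
target instance its own reserve, so stacked or multi-instance targets can no
longer starve one another. The structure and constructor/destructor names below
are made up; mempool_create_kmalloc_pool() and mempool_destroy() are the real
API.

/*
 * Hypothetical per-instance pool ownership -- an assumption about the
 * proper fix, not code from device-mapper.
 */
#include <linux/errno.h>
#include <linux/mempool.h>

struct my_target_instance {		/* hypothetical per-target context */
	mempool_t *io_pool;		/* private reserve, never shared */
};

static int my_target_ctr(struct my_target_instance *ti)
{
	/* each instance reserves its own handful of 256-byte elements */
	ti->io_pool = mempool_create_kmalloc_pool(16, 256);
	if (!ti->io_pool)
		return -ENOMEM;
	return 0;
}

static void my_target_dtr(struct my_target_instance *ti)
{
	mempool_destroy(ti->io_pool);
}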
For now, this sad-but-safe little patch will allow the machine to recover.
[akpm@osdl.org: rewrote changelog]
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
 mm/mempool.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/mm/mempool.c b/mm/mempool.c
index fe6e05289cc5..ccd8cb8cd41f 100644
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -238,8 +238,13 @@ repeat_alloc:
 	init_wait(&wait);
 	prepare_to_wait(&pool->wait, &wait, TASK_UNINTERRUPTIBLE);
 	smp_mb();
-	if (!pool->curr_nr)
-		io_schedule();
+	if (!pool->curr_nr) {
+		/*
+		 * FIXME: this should be io_schedule().  The timeout is there
+		 * as a workaround for some DM problems in 2.6.18.
+		 */
+		io_schedule_timeout(5*HZ);
+	}
 	finish_wait(&pool->wait, &wait);
 
 	goto repeat_alloc;