author | Lars Ellenberg <lars.ellenberg@linbit.com> | 2011-11-24 10:36:25 +0100 |
---|---|---|
committer | Philipp Reisner <philipp.reisner@linbit.com> | 2012-11-08 16:58:21 +0100 |
commit | 2312f0b3c5ab794fbac9e9bebe90c784c9d449c5 (patch) | |
tree | 0b4b74af502169d61e026855593b828fa202665a /drivers/block/drbd/drbd_receiver.c | |
parent | f9916d61a40e7ad43c2a20444894f85c45512f91 (diff) | |
drbd: fix potential deadlock during "restart" of conflicting writes
w_restart_write(), run from worker context, calls __drbd_make_request()
and further drbd_al_begin_io(, delegate=true), which then
potentially deadlocks. The previous patch moved a BUG_ON to expose
such call paths, which would now be triggered.
Also, if we call __drbd_make_request() from resource worker context,
like w_restart_write() did, and that should block for whatever reason
(!drbd_state_is_stable(), resource suspended, ...),
we potentially deadlock the whole resource, as the worker
is needed for state changes and other things.
Create a dedicated retry workqueue for this instead.
Also make sure that inc_ap_bio()/dec_ap_bio() are properly paired,
even if do_retry() needs to retry itself,
in case __drbd_make_request() returns != 0.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
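The idea in the message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the DRBD implementation: every name in it (`struct device`, `retry_enqueue()`, `try_submit()`, `retry_drain()`, `request_complete()`) is hypothetical, and the reference-counting rule it uses (each submit attempt takes one reference, a failed attempt gives it back, a successful one keeps it until completion) is an assumption chosen to show how a retry loop can keep a counter in the style of inc_ap_bio()/dec_ap_bio() balanced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a postponed write; not a DRBD structure. */
struct request {
	int fails_left;         /* submit attempts that fail before one succeeds */
	int attempts;           /* total submit attempts made so far */
	struct request *next;   /* link on the retry queue */
};

/* Hypothetical model of a device with its own retry queue. */
struct device {
	int ap_bio_cnt;         /* in-flight reference counter */
	struct request *head;   /* retry queue, FIFO */
	struct request *tail;
};

/* Called from the "main worker": only queues the request, never blocks. */
static void retry_enqueue(struct device *dev, struct request *req)
{
	req->next = NULL;
	if (dev->tail)
		dev->tail->next = req;
	else
		dev->head = req;
	dev->tail = req;
}

/* Returns nonzero if the request must be retried.
 * Assumption: a failed attempt drops the reference the caller took. */
static int try_submit(struct device *dev, struct request *req)
{
	req->attempts++;
	if (req->fails_left > 0) {
		req->fails_left--;
		dev->ap_bio_cnt--;
		return 1;
	}
	return 0;               /* reference is held until completion */
}

/* Completion of a successfully submitted request drops its reference. */
static void request_complete(struct device *dev)
{
	assert(dev->ap_bio_cnt > 0);
	dev->ap_bio_cnt--;
}

/* Runs in the dedicated retry context. It may loop or block here
 * without stalling the main worker. Returns the number of requests
 * that were resubmitted successfully. */
static int retry_drain(struct device *dev)
{
	struct request *req;
	int done = 0;

	while ((req = dev->head) != NULL) {
		dev->head = req->next;
		if (!dev->head)
			dev->tail = NULL;
		do {
			dev->ap_bio_cnt++;      /* one reference per attempt */
		} while (try_submit(dev, req));
		done++;
	}
	return done;
}
```

In this model the caller that detects a conflicting write only calls `retry_enqueue()` and returns, so it stays available for other work; all looping happens in `retry_drain()`. Because a reference is taken before every attempt and returned by every failed attempt, the counter ends up holding exactly one reference per successfully submitted request, regardless of how many retries were needed.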
Diffstat (limited to 'drivers/block/drbd/drbd_receiver.c')
-rw-r--r-- | drivers/block/drbd/drbd_receiver.c | 32 |
1 files changed, 3 insertions, 29 deletions
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 8a7f61ba74a8..b159ad15abe5 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1748,30 +1748,6 @@ static int receive_RSDataReply(struct drbd_tconn *tconn, struct packet_info *pi)
 	return err;
 }
 
-static int w_restart_write(struct drbd_work *w, int cancel)
-{
-	struct drbd_request *req = container_of(w, struct drbd_request, w);
-	struct drbd_conf *mdev = w->mdev;
-	struct bio *bio;
-	unsigned long start_time;
-	unsigned long flags;
-
-	spin_lock_irqsave(&mdev->tconn->req_lock, flags);
-	if (!expect(req->rq_state & RQ_POSTPONED)) {
-		spin_unlock_irqrestore(&mdev->tconn->req_lock, flags);
-		return -EIO;
-	}
-	bio = req->master_bio;
-	start_time = req->start_time;
-	/* Postponed requests will not have their master_bio completed! */
-	__req_mod(req, DISCARD_WRITE, NULL);
-	spin_unlock_irqrestore(&mdev->tconn->req_lock, flags);
-
-	while (__drbd_make_request(mdev, bio, start_time))
-		/* retry */ ;
-	return 0;
-}
-
 static void restart_conflicting_writes(struct drbd_conf *mdev,
 				       sector_t sector, int size)
 {
@@ -1785,11 +1761,9 @@ static void restart_conflicting_writes(struct drbd_conf *mdev,
 		if (req->rq_state & RQ_LOCAL_PENDING ||
 		    !(req->rq_state & RQ_POSTPONED))
 			continue;
-		if (expect(list_empty(&req->w.list))) {
-			req->w.mdev = mdev;
-			req->w.cb = w_restart_write;
-			drbd_queue_work(&mdev->tconn->data.work, &req->w);
-		}
+		/* as it is RQ_POSTPONED, this will cause it to
+		 * be queued on the retry workqueue. */
+		__req_mod(req, DISCARD_WRITE, NULL);
 	}
 }