block: schedule queue restart after BLK_STS_ZONE_RESOURCE

When dispatching a zone append write request to a SCSI zoned block device, if the target zone of the request is already locked, the device driver will return BLK_STS_ZONE_RESOURCE and the request will be pushed back to the hctx dispatch queue. The queue will be marked as RESTART in dd_finish_request() and restarted in __blk_mq_free_request(). However, this restart applies to the hctx of the completed request. If the requeued request is on a different hctx, dispatch will not be retried until another request is submitted or the next periodic queue run triggers, leading to up to 30 seconds of latency for the requeued request.

Fix this problem by scheduling a queue restart similarly to the BLK_STS_RESOURCE case or when we cannot get the budget.

Also, consolidate the checks into the "needs_resource" variable to simplify the condition.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Niklas Cassel <Niklas.Cassel@wdc.com>
Link: https://lore.kernel.org/r/20211026165127.4151055-1-naohiro.aota@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
author: Naohiro Aota <naohiro.aota@wdc.com> 2021-10-27 01:51:27 +0900
committer: Jens Axboe <axboe@kernel.dk> 2021-10-26 16:00:36 -0600
commit: 9586e67b911c95ba158fcc247b230e9c2d718623 (patch)
tree: 71af4c5b5d89759e42e27dbfef0872056d45a40a /block
parent: d308ae0d299a6bb15be4efb91849582d19c23213 (diff)
1 files changed, 9 insertions, 4 deletions
diff --git a/block/blk-mq.c b/block/blk-mq.c
index bc026372de43..652a31fc3bb3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1325,6 +1325,7 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 	int errors, queued;
 	blk_status_t ret = BLK_STS_OK;
 	LIST_HEAD(zone_list);
+	bool needs_resource = false;
 
 	if (list_empty(list))
 		return false;
@@ -1370,6 +1371,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 			queued++;
 			break;
 		case BLK_STS_RESOURCE:
+			needs_resource = true;
+			fallthrough;
 		case BLK_STS_DEV_RESOURCE:
 			blk_mq_handle_dev_resource(rq, list);
 			goto out;
@@ -1380,6 +1383,7 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 			 * accept.
 			 */
 			blk_mq_handle_zone_resource(rq, &zone_list);
+			needs_resource = true;
 			break;
 		default:
 			errors++;
@@ -1406,7 +1410,6 @@ out:
 		/* For non-shared tags, the RESTART check will suffice */
 		bool no_tag = prep == PREP_DISPATCH_NO_TAG &&
 			(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED);
-		bool no_budget_avail = prep == PREP_DISPATCH_NO_BUDGET;
 
 		if (nr_budgets)
 			blk_mq_release_budgets(q, list);
@@ -1447,14 +1450,16 @@ out:
 		 * If driver returns BLK_STS_RESOURCE and SCHED_RESTART
 		 * bit is set, run queue after a delay to avoid IO stalls
 		 * that could otherwise occur if the queue is idle.  We'll do
-		 * similar if we couldn't get budget and SCHED_RESTART is set.
+		 * similar if we couldn't get budget or couldn't lock a zone
+		 * and SCHED_RESTART is set.
 		 */
 		needs_restart = blk_mq_sched_needs_restart(hctx);
+		if (prep == PREP_DISPATCH_NO_BUDGET)
+			needs_resource = true;
 		if (!needs_restart ||
 		    (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
 			blk_mq_run_hw_queue(hctx, true);
-		else if (needs_restart && (ret == BLK_STS_RESOURCE ||
-					   no_budget_avail))
+		else if (needs_restart && needs_resource)
 			blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY);
 
 		blk_mq_update_dispatch_busy(hctx, true);
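
The effect of the last hunk can be modeled outside the kernel. The following is a minimal userspace sketch, not kernel code: the `model_*` names, the `model_outcome` struct, and the three-way `model_action` result are all hypothetical stand-ins introduced for illustration, and the model only covers the branch shown in the diff where the rerun decision is made. It compares the condition before and after the patch for a dispatch pass whose only resource failure was BLK_STS_ZONE_RESOURCE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these names exist in the kernel. */
enum model_action {
	MODEL_NO_RUN,      /* nothing scheduled here; wait for another trigger */
	MODEL_RUN_NOW,     /* stands in for blk_mq_run_hw_queue(hctx, true) */
	MODEL_RUN_DELAYED, /* stands in for blk_mq_delay_run_hw_queue(...) */
};

/* Flattened view of the state the rerun decision looks at (assumed layout). */
struct model_outcome {
	bool saw_sts_resource;   /* a request returned BLK_STS_RESOURCE */
	bool saw_zone_resource;  /* a request returned BLK_STS_ZONE_RESOURCE */
	bool no_budget;          /* prep == PREP_DISPATCH_NO_BUDGET */
	bool needs_restart;      /* blk_mq_sched_needs_restart(hctx) */
	bool no_tag_not_waiting; /* no_tag && list_empty_careful(...) */
};

/* Condition as it read before the patch: zone failures are not considered. */
enum model_action model_before_patch(const struct model_outcome *o)
{
	if (!o->needs_restart || o->no_tag_not_waiting)
		return MODEL_RUN_NOW;
	if (o->saw_sts_resource || o->no_budget)
		return MODEL_RUN_DELAYED;
	return MODEL_NO_RUN;
}

/* Condition after the patch: all three causes fold into needs_resource. */
enum model_action model_after_patch(const struct model_outcome *o)
{
	bool needs_resource = o->saw_sts_resource ||
			      o->saw_zone_resource ||
			      o->no_budget;

	if (!o->needs_restart || o->no_tag_not_waiting)
		return MODEL_RUN_NOW;
	if (needs_resource)
		return MODEL_RUN_DELAYED;
	return MODEL_NO_RUN;
}

/* Usage example: the zone-locked case is the only one whose result changes. */
void model_examples(void)
{
	struct model_outcome zone_only = {
		.saw_zone_resource = true,
		.needs_restart = true,
	};
	struct model_outcome sts_resource = {
		.saw_sts_resource = true,
		.needs_restart = true,
	};

	assert(model_before_patch(&zone_only) == MODEL_NO_RUN);
	assert(model_after_patch(&zone_only) == MODEL_RUN_DELAYED);
	assert(model_before_patch(&sts_resource) == MODEL_RUN_DELAYED);
	assert(model_after_patch(&sts_resource) == MODEL_RUN_DELAYED);
}
```

In this model, a pass that hit only a locked zone while SCHED_RESTART was set previously scheduled nothing on the requeued request's hctx. With the patch it takes the same delayed-rerun path as BLK_STS_RESOURCE and the no-budget case.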