summaryrefslogtreecommitdiff
path: root/drivers/mmc/host/mmc_hsq.h
AgeCommit message (Collapse)Author
2023-09-27mmc: hsq: Improve random I/O write performance for 4k buffersWenchao Chen
By dynamically adjusting the host->hsq_depth, based upon the buffer size being 4k and that we get at least two I/O write requests in flight, we can improve the throughput a bit. This is typical for a random I/O write pattern. More precisely, by dynamically changing the number of requests in flight from 2 to 5, we can on some platforms observe ~4-5% increase in throughput. Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com> Link: https://lore.kernel.org/r/20230919074707.25517-3-wenchao.chen@unisoc.com [Ulf: Re-wrote the commitmsg, minor adjustment to the code - all to clarify.] Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-09-27mmc: core: Allow dynamical updates of the number of requests for hsqWenchao Chen
To allow dynamical updates of the current number of used in-flight requests, let's move away from using a hard-coded value to a use a corresponding variable in the struct mmc_host. This can be valuable when optimizing for certain I/O request sequences, as shown by subsequent changes. Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com> Link: https://lore.kernel.org/r/20230919074707.25517-2-wenchao.chen@unisoc.com [Ulf: Re-wrote the commitmsg to clarify the change] Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-12-07mmc: mmc-hsq: Use fifo to dispatch mmc_requestMichael Wu
Current next_tag selection will cause a large delay in some requests and destroy the scheduling results of the block scheduling layer. Because the issued mrq tags cannot ensure that each time is sequential, especially when the IO load is heavy. In the fio performance test, we found that 4k random read data was sent to mmc_hsq to start calling request_atomic It takes nearly 200ms to process the request, while mmc_hsq has processed thousands of other requests. So we use fifo here to ensure the first in, first out feature of the request and avoid adding additional delay to the request. Reviewed-by: Wenchao Chen <wenchao.chen@unisoc.com> Signed-off-by: Michael Wu <michael@allwinnertech.com> Link: https://lore.kernel.org/r/20221128093847.22768-1-michael@allwinnertech.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-05-28mmc: host: Introduce the request_atomic() for the hostBaolin Wang
The SD host controller can process one request in the atomic context if the card is nonremovable, which means we can submit next request in the irq hard handler when using the MMC host software queue to reduce the latency. Thus this patch adds a new API request_atomic() for the host controller, as well as adding support for host software queue to submit a request by the new request_atomic() API. Moreover there is an unusual case that the card is busy when trying to send a command, and we can not polling the card status in interrupt context by using request_atomic() to dispatch requests. Thus we should queue a work to try again in the non-atomic context in case the host releases the busy signal later. Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Baolin Wang <baolin.wang7@gmail.com> Link: https://lore.kernel.org/r/a344e27e506cb2329073cbd5cf65e15cc3cbeba9.1586744073.git.baolin.wang7@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-03-24mmc: Add MMC host software queue supportBaolin Wang
Now the MMC read/write stack will always wait for previous request is completed by mmc_blk_rw_wait(), before sending a new request to hardware, or queue a work to complete request, that will bring context switching overhead and spend some extra time to poll the card for busy completion for I/O writes via sending CMD13, especially for high I/O per second rates, to affect the IO performance. Thus this patch introduces MMC software queue interface based on the hardware command queue engine's interfaces, which is similar with the hardware command queue engine's idea, that can remove the context switching. Moreover we set the default queue depth as 64 for software queue, which allows more requests to be prepared, merged and inserted into IO scheduler to improve performance, but we only allow 2 requests in flight, that is enough to let the irq handler always trigger the next request without a context switch, as well as avoiding a long latency. Moreover the host controller should support HW busy detection for I/O operations when enabling the host software queue. That means, the host controller must not complete a data transfer request, until after the card stops signals busy. From the fio testing data in cover letter, we can see the software queue can improve some performance with 4K block size, increasing about 16% for random read, increasing about 90% for random write, though no obvious improvement for sequential read and write. Moreover we can expand the software queue interface to support MMC packed request or packed command in future. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Baolin Wang <baolin.wang7@gmail.com> Link: https://lore.kernel.org/r/4409c1586a9b3ed20d57ad2faf6c262fc3ccb6e2.1581478568.git.baolin.wang7@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>