md/raid1: set max_sectors during early return from choose_slow_rdev()

Linux 6.9+ is unable to start a degraded RAID1 array with one drive, when that drive has a write-mostly flag set. During such an attempt, the following assertion in bio_split() is hit: BUG_ON(sectors <= 0); Call Trace: ? bio_split+0x96/0xb0 ? exc_invalid_op+0x53/0x70 ? bio_split+0x96/0xb0 ? asm_exc_invalid_op+0x1b/0x20 ? bio_split+0x96/0xb0 ? raid1_read_request+0x890/0xd20 ? __call_rcu_common.constprop.0+0x97/0x260 raid1_make_request+0x81/0xce0 ? __get_random_u32_below+0x17/0x70 ? new_slab+0x2b3/0x580 md_handle_request+0x77/0x210 md_submit_bio+0x62/0xa0 __submit_bio+0x17b/0x230 submit_bio_noacct_nocheck+0x18e/0x3c0 submit_bio_noacct+0x244/0x670 After investigation, it turned out that choose_slow_rdev() does not set the value of max_sectors in some cases and because of it, raid1_read_request calls bio_split with sectors == 0. Fix it by filling in this variable. This bug was introduced in commit dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()") but apparently hidden until commit 0091c5a269ec ("md/raid1: factor out helpers to choose the best rdev from read_balance()") shortly thereafter. Cc: stable@vger.kernel.org # 6.9.x+ Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Fixes: dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()") Cc: Song Liu <song@kernel.org> Cc: Yu Kuai <yukuai3@huawei.com> Cc: Paul Luse <paul.e.luse@linux.intel.com> Cc: Xiao Ni <xni@redhat.com> Cc: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Link: https://lore.kernel.org/linux-raid/20240706143038.7253-1-mat.jonczyk@o2.pl/ -- Tested on both Linux 6.10 and 6.9.8. Inside a VM, mdadm testsuite for RAID1 on 6.10 did not find any problems: ./test --dev=loop --no-error --raidtype=raid1 (on 6.9.8 there was one failure, caused by external bitmap support not compiled in). Notes: - I was reliably getting deadlocks when adding / removing devices on such an array - while the array was loaded with fsstress with 20 concurrent processes. When the array was idle or loaded with fsstress with 8 processes, no such deadlocks happened in my tests. This occurred also on unpatched Linux 6.8.0 though, but not on 6.1.97-rc1, so this is likely an independent regression (to be investigated). - I was also getting deadlocks when adding / removing the bitmap on the array in similar conditions - this happened on Linux 6.1.97-rc1 also though. fsstress with 8 concurrent processes did cause it only once during many tests. - in my testing, there was once a problem with hot adding an internal bitmap to the array: mdadm: Cannot add bitmap while array is resyncing or reshaping etc. mdadm: failed to set internal bitmap. even though no such reshaping was happening according to /proc/mdstat. This seems unrelated, though. Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240711202316.10775-1-mat.jonczyk@o2.pl
author: Mateusz Jończyk <mat.jonczyk@o2.pl> 2024-07-11 22:23:16 +0200
committer: Song Liu <song@kernel.org> 2024-07-12 01:30:38 +0000
commit: 36a5c03f232719eb4e2d925f4d584e09cfaf372c (patch)
tree: 7a8f4010c55ed7c0898398401c71fb4fdc0457a2 /drivers/md
parent: 35a0a409fa269c287c4378f1aefe84ae8b5211a1 (diff)
download: lwn-36a5c03f232719eb4e2d925f4d584e09cfaf372c.tar.gz
lwn-36a5c03f232719eb4e2d925f4d584e09cfaf372c.zip
1 files changed, 1 insertions, 0 deletions
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 04a0c2ca1732..7acfe7c9dc8d 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -680,6 +680,7 @@ static int choose_slow_rdev(struct r1conf *conf, struct r1bio *r1_bio,
 		len = r1_bio->sectors;
 		read_len = raid1_check_read_range(rdev, this_sector, &len);
 		if (read_len == r1_bio->sectors) {
+			*max_sectors = read_len;
 			update_read_sectors(conf, disk, this_sector, read_len);
 			return disk;
 		}
author	Mateusz Jończyk <mat.jonczyk@o2.pl>	2024-07-11 22:23:16 +0200
committer	Song Liu <song@kernel.org>	2024-07-12 01:30:38 +0000
commit	36a5c03f232719eb4e2d925f4d584e09cfaf372c (patch)
tree	7a8f4010c55ed7c0898398401c71fb4fdc0457a2 /drivers/md
parent	35a0a409fa269c287c4378f1aefe84ae8b5211a1 (diff)
download	lwn-36a5c03f232719eb4e2d925f4d584e09cfaf372c.tar.gz lwn-36a5c03f232719eb4e2d925f4d584e09cfaf372c.zip