lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2022-12-14	scsi: ufs: core: WLUN suspend SSU/enter hibern8 fail recovery	Peter Wang
	When SSU/enter hibern8 fail in WLUN suspend flow, trigger the error handler and return busy to break the suspend. Otherwise the consumer will get stuck in runtime suspend status. Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun") Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20221208072520.26210-1-peter.wang@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-14	scsi: scsi_debug: Delete unreachable code in inquiry_vpd_b0()	John Garry
	The 2nd return statement in inquiry_vpd_b0() is unreachable, so delete it. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20221213142122.1011886-1-john.g.garry@oracle.com Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-14	scsi: mpi3mr: Refer CONFIG_SCSI_MPI3MR in Makefile	Shin'ichiro Kawasaki
	When Kconfig item CONFIG_SCSI_MPI3MR was introduced for mpi3mr driver, the Makefile of the driver was not modified to refer the Kconfig item. As a result, mpi3mr.ko is built regardless of the Kconfig item value y or m. Also, if 'make localmodconfig' can not find the Kconfig item in the Makefile, then it does not generate CONFIG_SCSI_MPI3MR=m even when mpi3mr.ko is loaded on the system. Refer to the Kconfig item to avoid the issues. Fixes: c4f7ac64616e ("scsi: mpi3mr: Add mpi30 Rev-R headers and Kconfig") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20221207023659.2411785-1-shinichiro.kawasaki@wdc.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-14	scsi: core: scsi_error: Do not queue pointless abort workqueue functions	Hannes Reinecke
	If a host template doesn't implement the .eh_abort_handler() there is no point in queueing the abort workqueue function; all it does is invoking SCSI EH anyway. So return 'FAILED' from scsi_abort_command() if the .eh_abort_handler() is not implemented and save us from having to wait for the abort workqueue function to complete. Cc: Niklas Cassel <niklas.cassel@wdc.com> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.de> [niklas: moved the check to the top of scsi_abort_command()] Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20221206131346.2045375-1-niklas.cassel@wdc.com Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-14	scsi: storvsc: Fix swiotlb bounce buffer leak in confidential VM	Michael Kelley
	storvsc_queuecommand() maps the scatter/gather list using scsi_dma_map(), which in a confidential VM allocates swiotlb bounce buffers. If the I/O submission fails in storvsc_do_io(), the I/O is typically retried by higher level code, but the bounce buffer memory is never freed. The mostly like cause of I/O submission failure is a full VMBus channel ring buffer, which is not uncommon under high I/O loads. Eventually enough bounce buffer memory leaks that the confidential VM can't do any I/O. The same problem can arise in a non-confidential VM with kernel boot parameter swiotlb=force. Fix this by doing scsi_dma_unmap() in the case of an I/O submission error, which frees the bounce buffer memory. Fixes: 743b237c3a7b ("scsi: storvsc: Add Isolation VM support for storvsc driver") Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1670183564-76254-1-git-send-email-mikelley@microsoft.com Tested-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-14	scsi: iscsi: Fix multiple iSCSI session unbind events sent to userspace	Wenchao Hao
	It was observed that the kernel would potentially send ISCSI_KEVENT_UNBIND_SESSION multiple times. Introduce 'target_state' in iscsi_cls_session() to make sure session will send only one unbind session event. This introduces a regression wrt. the issue fixed in commit 13e60d3ba287 ("scsi: iscsi: Report unbind session event when the target has been removed"). If iscsid dies for any reason after sending an unbind session to kernel, once iscsid is restarted, the kernel's ISCSI_KEVENT_UNBIND_SESSION event is lost and userspace is then unable to logout. However, the session is actually in invalid state (its target_id is INVALID) so iscsid should not sync this session during restart. Consequently we need to check the session's target state during iscsid restart. If session is in unbound state, do not sync this session and perform session teardown. This is OK because once a session is unbound, we can not recover it any more (mainly because its target id is INVALID). Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Link: https://lore.kernel.org/r/20221126010752.231917-1-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-01	scsi: qla2xxx: Fix crash when I/O abort times out	Arun Easi
	While performing CPU hotplug, a crash with the following stack was seen: Call Trace: qla24xx_process_response_queue+0x42a/0x970 [qla2xxx] qla2x00_start_nvme_mq+0x3a2/0x4b0 [qla2xxx] qla_nvme_post_cmd+0x166/0x240 [qla2xxx] nvme_fc_start_fcp_op.part.0+0x119/0x2e0 [nvme_fc] blk_mq_dispatch_rq_list+0x17b/0x610 __blk_mq_sched_dispatch_requests+0xb0/0x140 blk_mq_sched_dispatch_requests+0x30/0x60 __blk_mq_run_hw_queue+0x35/0x90 __blk_mq_delay_run_hw_queue+0x161/0x180 blk_execute_rq+0xbe/0x160 __nvme_submit_sync_cmd+0x16f/0x220 [nvme_core] nvmf_connect_admin_queue+0x11a/0x170 [nvme_fabrics] nvme_fc_create_association.cold+0x50/0x3dc [nvme_fc] nvme_fc_connect_ctrl_work+0x19/0x30 [nvme_fc] process_one_work+0x1e8/0x3c0 On abort timeout, completion was called without checking if the I/O was already completed. Verify that I/O and abort request are indeed outstanding before attempting completion. Fixes: 71c80b75ce8f ("scsi: qla2xxx: Do command completion on abort timeout") Reported-by: Marco Patalano <mpatalan@redhat.com> Tested-by: Marco Patalano <mpatalan@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20221129092634.15347-1-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-01	nvme: Convert NVMe errors to PR errors	Mike Christie
	This converts the NVMe errors we commonly see during PR handling to PR_STS errors or -Exyz errors. pr_ops callers can then handle SCSI and NVMe errors without knowing the device types. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20221122032603.32766-5-michael.christie@oracle.com Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-01	scsi: sd: Convert SCSI errors to PR errors	Mike Christie
	This converts the SCSI errors we commonly see during PR handling to PR_STS errors or -Exyz errors. pr_ops callers can then handle SCSI and NVMe errors without knowing the device types. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20221122032603.32766-4-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-01	scsi: core: Rename status_byte to sg_status_byte	Mike Christie
	The next patch adds a helper status_byte function that works like host_byte, so this patch renames the old status_byte to sg_status_byte since it's only used for SG IO. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20221122032603.32766-3-michael.christie@oracle.com Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-01	block: Add error codes for common PR failures	Mike Christie
	If a PR operation fails we can return a device-specific error which is impossible to handle in some cases because we could have a mix of devices when DM is used, or future users like LIO only knows it's interacting with a block device so it doesn't know the type. This patch adds a new pr_status enum so drivers can convert errors to a common type which can be handled by the caller. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20221122032603.32766-2-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-01	scsi: sd: sd_zbc: Trace zone append emulation	Johannes Thumshirn
	Add tracepoints to the SCSI zone append emulation in order to trace the zone start to write-pointer aligned LBA translation and the corresponding completion. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/d103bcf5f90139143469f2a0084c74bd9e03ad4a.1669804487.git.johannes.thumshirn@wdc.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-01	scsi: libfc: Include the correct header	Christophe JAILLET
	This file does not use rcu, so there is no point in including <linux/rculist.h>. The dependency has been removed in commit fa519f701d27 ("scsi: libfc: fixup 'sleeping function called from invalid context'") It turned a list_for_each_entry_rcu() into a list_for_each_entry(). So just #include <linux/list.h> now. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/960f34418358f0c35e645aa2cf7e0ec7fe6b60b9.1669461197.git.christophe.jaillet@wanadoo.fr Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: sg: Fix get_user() in call sg_scsi_ioctl()	Kirill A. Shutemov
	get_user() expects the pointer to be pointer-to-simple-variable type, but sic->data is array of 'unsigned char'. It violates get_user() contracts. Explicitly take pointer to the first element of the array. It matches current behaviour. This is preparation for fixing sparse warnings caused by Linear Address Masking patchset. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/r/20221117232304.1544-1-kirill.shutemov@linux.intel.com Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: megaraid_sas: Fix some spelling mistakes in comment	Yu Zhe
	Fix typos in comment. Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Link: https://lore.kernel.org/r/20221125020703.22216-1-yuzhe@nfschina.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: core: Use SCSI_SCAN_INITIAL in do_scsi_scan_host()	John Garry
	Instead of using hardcoded '0' as the do_scsi_scan_host() -> scsi_scan_host_selected() rescan arg, use proper macro SCSI_SCAN_INITIAL. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20221121121725.1910795-3-john.g.garry@oracle.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: core: Use SCSI_SCAN_RESCAN in __scsi_add_device()	John Garry
	Instead of using hardcoded '1' as the __scsi_add_device() -> scsi_probe_and_add_lun() rescan arg, use proper macro SCSI_SCAN_RESCAN. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20221121121725.1910795-2-john.g.garry@oracle.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: ufs: ufs-mediatek: Remove unnecessary return code	ChanWoo Lee
	Modify to remove unnecessary 'return 0' code. Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Link: https://lore.kernel.org/r/20221121003338.11034-1-cw9316.lee@samsung.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: ufs: core: Fix the polling implementation	Bart Van Assche
	Fix the following issues in ufshcd_poll(): - If polling succeeds, return a positive value. - Do not complete polling requests from interrupt context because the block layer expects these requests to be completed from thread context. From block/bio.c: If REQ_ALLOC_CACHE is set, the final put of the bio MUST be done from process context, not hard/soft IRQ. Fixes: eaab9b573054 ("scsi: ufs: Implement polling support") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221118233717.441298-1-bvanassche@acm.org Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: libsas: Do not export sas_ata_wait_after_reset()	Jie Zhan
	sas_ata_wait_after_reset() does not need to be exported since it is no longer referenced outside libsas. Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-6-zhanjie9@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset	Jie Zhan
	SATA devices on an expander may be removed and not be found again when I_T nexus reset and revalidation are processed simultaneously. The issue comes from: - Revalidation can remove SATA devices in link reset, e.g. in hisi_sas_clear_nexus_ha(). - However, hisi_sas_debug_I_T_nexus_reset() polls the state of a SATA device on an expander after sending link_reset, where it calls: hisi_sas_debug_I_T_nexus_reset sas_ata_wait_after_reset ata_wait_after_reset ata_wait_ready smp_ata_check_ready sas_ex_phy_discover sas_ex_phy_discover_helper sas_set_ex_phy The ex_phy's change count is updated in sas_set_ex_phy(), so SATA devices after a link reset may not be found later through revalidation. A similar issue was reported in: commit 0f3fce5cc77e ("[SCSI] libsas: fix ata_eh clobbering ex_phys via smp_ata_check_ready") commit 87c8331fcf72 ("[SCSI] libsas: prevent domain rediscovery competing with ata error handling"). To address this issue, in hisi_sas_debug_I_T_nexus_reset(), we now call smp_ata_check_ready_type() that only polls the device type while not updating the ex_phy's data of libsas. Fixes: 71453bd9d1bf ("scsi: hisi_sas: Use sas_ata_wait_after_reset() in IT nexus reset") Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-5-zhanjie9@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: libsas: Add smp_ata_check_ready_type()	Jie Zhan
	Create function smp_ata_check_ready_type() for LLDDs to wait for SATA devices to come up after a link reset. Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-4-zhanjie9@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: Revert "scsi: hisi_sas: Don't send bcast events from HW during nexus ↵	Jie Zhan
	HA reset" This reverts commit f5f2a2716055ad8c0c4ff83e51d667646c6c5d8a. This is now unnecessary to solve the SATA devices missing issue in hisi_sas_clear_nexus_ha(). Hence, we should not ignore bcast events during sas_eh_handle_sas_errors() in case of missing bcast events, unless a justified need is found and a mechanism to defer (but not ignore) bcast events in sas_eh_handle_sas_errors() is provided. Also, in hisi_sas_clear_nexus_ha(), there is nothing further to handle in "out: " other than return, so that part can be reverted. Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-3-zhanjie9@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: Revert "scsi: hisi_sas: Drain bcast events in hisi_sas_rescan_topology()"	Jie Zhan
	This reverts commit 11ff0c98fca35df16c84d4eee52008faecaf10a6. Draining or flushing events in hisi_sas_rescan_topology() can hang the driver, typically with phy up or phy down events being processed, i.e. sas_porte_bytes_dmaed() or sas_phye_loss_of_signal(). Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-2-zhanjie9@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: ufs: ufs-mediatek: Modify the return value	ChanWoo Lee
	Be consistent with the rest of driver wrt. functions returning bool. 91: return !!(host->caps & UFS_MTK_CAP_BOOST_CRYPT_ENGINE); 98: return !!(host->caps & UFS_MTK_CAP_VA09_PWR_CTRL); 105: return !!(host->caps & UFS_MTK_CAP_BROKEN_VCC); Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Link: https://lore.kernel.org/r/20221118045242.2770-1-cw9316.lee@samsung.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: ufs: ufs-mediatek: Remove unneeded code	ChanWoo Lee
	Remove unnecessary if/goto code. Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Link: https://lore.kernel.org/r/20221118044136.921-1-cw9316.lee@samsung.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: device_handler: alua: Call scsi_device_put() from non-atomic context	Bart Van Assche
	Since commit f93ed747e2c7 ("scsi: core: Release SCSI devices synchronously"), scsi_device_put() might sleep. Avoid calling it from alua_rtpg_queue() with the pg_lock held. The lock only pretects h->pg, anyway. To avoid the pg being freed under us, because of a race with another thread, take a temporary reference. In alua_rtpg_queue(), verify that the pg still belongs to the sdev being passed before actually queueing the RTPG. This patch fixes the following smatch warning: drivers/scsi/device_handler/scsi_dh_alua.c:1013 alua_rtpg_queue() warn: sleeping in atomic context alua_check_vpd() <- disables preempt -> alua_rtpg_queue() -> scsi_device_put() Cc: Martin Wilck <mwilck@suse.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Sachin Sant <sachinp@linux.ibm.com> Cc: Benjamin Block <bblock@linux.ibm.com> Suggested-by: Martin Wilck <mwilck@suse.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221117183626.2656196-3-bvanassche@acm.org Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: device_handler: alua: Revert "Move a scsi_device_put() call out of ↵	Bart Van Assche
	alua_check_vpd()" There is a bug in commit 0b25e17e9018 ("scsi: alua: Move a scsi_device_put() call out of alua_check_vpd()"): that patch may cause alua_rtpg_queue() callers to call scsi_device_put() even if that function should not be called. Revert that commit to prepare for a different solution. Cc: Hannes Reinecke <hare@suse.de> Cc: Martin Wilck <mwilck@suse.com> Cc: Sachin Sant <sachinp@linux.ibm.com> Cc: Benjamin Block <bblock@linux.ibm.com> Reported-by: Sachin Sant <sachinp@linux.ibm.com> Reported-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221117183626.2656196-2-bvanassche@acm.org Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: snic: Fix possible UAF in snic_tgt_create()	Gaosheng Cui
	Smatch reports a warning as follows: drivers/scsi/snic/snic_disc.c:307 snic_tgt_create() warn: '&tgt->list' not removed from list If device_add() fails in snic_tgt_create(), tgt will be freed, but tgt->list will not be removed from snic->disc.tgt_list, then list traversal may cause UAF. Remove from snic->disc.tgt_list before free(). Fixes: c8806b6c9e82 ("snic: driver for Cisco SCSI HBA") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Link: https://lore.kernel.org/r/20221117035100.2944812-1-cuigaosheng1@huawei.com Acked-by: Narsimhulu Musini <nmusini@cisco.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: qla2xxx: Initialize vha->unknown_atio_[list, work] for NPIV hosts	Gleb Chesnokov
	Initialization of vha->unknown_atio_list and vha->unknown_atio_work only happens for base_vha in qlt_probe_one_stage1(). But there is no initialization for NPIV hosts that are created in qla24xx_vport_create(). This causes a crash when trying to access these NPIV host fields. Fix this by adding initialization to qla_vport_create(). Signed-off-by: Gleb Chesnokov <gleb.chesnokov@scst.dev> Link: https://lore.kernel.org/r/376c89a2-a9ac-bcf9-bf0f-dfe89a02fd4b@scst.dev Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: qla2xxx: Remove duplicate of vha->iocb_work initialization	Gleb Chesnokov
	Commit 9b3e0f4d4147 ("scsi: qla2xxx: Move work element processing out of DPC thread") introduced the initialization of vha->iocb_work in qla2x00_create_host() function. This initialization is also called from qla2x00_probe_one() function, just after qla2x00_create_host(). Hence remove this duplicate call since it has already been called before. Signed-off-by: Gleb Chesnokov <gleb.chesnokov@scst.dev> Link: https://lore.kernel.org/r/822b3823-f344-67d6-30f1-16e31cf68eed@scst.dev Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: fcoe: Fix transport not deattached when fcoe_if_init() fails	Chen Zhongjin
	fcoe_init() calls fcoe_transport_attach(&fcoe_sw_transport), but when fcoe_if_init() fails, &fcoe_sw_transport is not detached and leaves freed &fcoe_sw_transport on fcoe_transports list. This causes panic when reinserting module. BUG: unable to handle page fault for address: fffffbfff82e2213 RIP: 0010:fcoe_transport_attach+0xe1/0x230 [libfcoe] Call Trace: <TASK> do_one_initcall+0xd0/0x4e0 load_module+0x5eee/0x7210 ... Fixes: 78a582463c1e ("[SCSI] fcoe: convert fcoe.ko to become an fcoe transport provider driver") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Link: https://lore.kernel.org/r/20221115092442.133088-1-chenzhongjin@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: sd: Use 16-byte SYNCHRONIZE CACHE on ZBC devices	Shin'ichiro Kawasaki
	ZBC Zoned Block Commands specification mandates SYNCHRONIZE CACHE(16) for host-managed zoned block devices, but does not mandate SYNCHRONIZE CACHE(10). Call SYNCHRONIZE CACHE(16) in place of SYNCHRONIZE CACHE(10) to ensure that the command is always supported. For this purpose, add use_16_for_sync flag to struct scsi_device in same manner as use_16_for_rw flag. To be precise, ZBC does not mandate SYNCHRONIZE CACHE(16) for host-aware zoned block devices. However, modern devices should support 16-byte commands. Hence, call SYNCHRONIZE CACHE (16) on both types of ZBC devices, host-aware and host-managed. Of note is that READ(16) and WRITE(16) have same story and they are already called for both types of ZBC devices. Another note is that this patch depends on the fix commit ea045fd344cb ("ata: libata-scsi: fix SYNCHRONIZE CACHE (16) command failure"). Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20221115002905.1709006-1-shinichiro.kawasaki@wdc.com Reviewed-by: Damien Le Moal <damien.lemoal@opendource.wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: ipr: Fix WARNING in ipr_init()	Shang XiaoJing
	ipr_init() will not call unregister_reboot_notifier() when pci_register_driver() fails, which causes a WARNING. Call unregister_reboot_notifier() when pci_register_driver() fails. notifier callback ipr_halt [ipr] already registered WARNING: CPU: 3 PID: 299 at kernel/notifier.c:29 notifier_chain_register+0x16d/0x230 Modules linked in: ipr(+) xhci_pci_renesas xhci_hcd ehci_hcd usbcore led_class gpu_sched drm_buddy video wmi drm_ttm_helper ttm drm_display_helper drm_kms_helper drm drm_panel_orientation_quirks agpgart cfbft CPU: 3 PID: 299 Comm: modprobe Tainted: G W 6.1.0-rc1-00190-g39508d23b672-dirty #332 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 RIP: 0010:notifier_chain_register+0x16d/0x230 Call Trace: <TASK> __blocking_notifier_chain_register+0x73/0xb0 ipr_init+0x30/0x1000 [ipr] do_one_initcall+0xdb/0x480 do_init_module+0x1cf/0x680 load_module+0x6a50/0x70a0 __do_sys_finit_module+0x12f/0x1c0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: f72919ec2bbb ("[SCSI] ipr: implement shutdown changes and remove obsolete write cache parameter") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Link: https://lore.kernel.org/r/20221113064513.14028-1-shangxiaojing@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: scsi_debug: Fix possible name leak in sdebug_add_host_helper()	Yang Yingliang
	Afer commit 1fa5ae857bb1 ("driver core: get rid of struct device's bus_id string array"), the name of device is allocated dynamically, it needs be freed when device_register() returns error. As comment of device_register() says, one should use put_device() to give up the reference in the error path. Fix this by calling put_device(), then the name can be freed in kobject_cleanup(), and sdbg_host is freed in sdebug_release_adapter(). When the device release is not set, it means the device is not initialized. We can not call put_device() in this case. Use kfree() to free memory. Fixes: 1fa5ae857bb1 ("driver core: get rid of struct device's bus_id string array") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221112131010.3757845-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-26	scsi: fcoe: Fix possible name leak when device_register() fails	Yang Yingliang
	If device_register() returns an error, the name allocated by dev_set_name() needs to be freed. As the comment of device_register() says, one should use put_device() to give up the reference in the error path. Fix this by calling put_device(), then the name can be freed in kobject_cleanup(). The 'fcf' is freed in fcoe_fcf_device_release(), so the kfree() in the error path can be removed. The 'ctlr' is freed in fcoe_ctlr_device_release(), so don't use the error label, just return NULL after calling put_device(). Fixes: 9a74e884ee71 ("[SCSI] libfcoe: Add fcoe_sysfs") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221112094310.3633291-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-25	scsi: scsi_debug: Fix a warning in resp_report_zones()	Harshit Mogalapalli
	As 'alloc_len' is user controlled data, if user tries to allocate memory larger than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack trace and messes up dmesg with a warning. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: 7db0e0c8190a ("scsi: scsi_debug: Fix buffer size of REPORT ZONES command") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221112070612.2121535-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-25	scsi: scsi_debug: Fix a warning in resp_verify()	Harshit Mogalapalli
	As 'vnum' is controlled by user, so if user tries to allocate memory larger than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack trace and messes up dmesg with a warning. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: c3e2fe9222d4 ("scsi: scsi_debug: Implement VERIFY(10), add VERIFY(16)") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221112070031.2121068-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-25	scsi: efct: Fix possible memleak in efct_device_init()	Chen Zhongjin
	In efct_device_init(), when efct_scsi_reg_fc_transport() fails, efct_scsi_tgt_driver_exit() is not called to release memory for efct_scsi_tgt_driver_init() and causes memleak: unreferenced object 0xffff8881020ce000 (size 2048): comm "modprobe", pid 465, jiffies 4294928222 (age 55.872s) backtrace: [<0000000021a1ef1b>] kmalloc_trace+0x27/0x110 [<000000004c3ed51c>] target_register_template+0x4fd/0x7b0 [target_core_mod] [<00000000f3393296>] efct_scsi_tgt_driver_init+0x18/0x50 [efct] [<00000000115de533>] 0xffffffffc0d90011 [<00000000d608f646>] do_one_initcall+0xd0/0x4e0 [<0000000067828cf1>] do_init_module+0x1cc/0x6a0 ... Fixes: 4df84e846624 ("scsi: elx: efct: Driver initialization routines") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Link: https://lore.kernel.org/r/20221111074046.57061-1-chenzhongjin@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-25	scsi: ufs: core: Fix unnecessary operation for early return	ChanWoo Lee
	Setting bitmap_len is not required when returning early. Defer until it is needed. Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Link: https://lore.kernel.org/r/20221111062301.7423-1-cw9316.lee@samsung.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-25	scsi: ufs: core: Switch 'check_for_bkops' to bool	ChanWoo Lee
	Only checks true and false so it can be converted to bool. Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Link: https://lore.kernel.org/r/20221111062209.7365-1-cw9316.lee@samsung.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-25	scsi: ufs: core: Separate function name and message	ChanWoo Lee
	Separate the function name and message to make it easier to check the log. Modify messages to fit the format of others. Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Link: https://lore.kernel.org/r/20221111062126.7307-1-cw9316.lee@samsung.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-25	scsi: hpsa: Fix possible memory leak in hpsa_add_sas_device()	Yang Yingliang
	If hpsa_sas_port_add_rphy() returns an error, the 'rphy' allocated in sas_end_device_alloc() needs to be freed. Address this by calling sas_rphy_free() in the error path. Fixes: d04e62b9d63a ("hpsa: add in sas transport class") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221111043012.1074466-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-25	scsi: hpsa: Fix error handling in hpsa_add_sas_host()	Yang Yingliang
	hpsa_sas_port_add_phy() does: ... sas_phy_add() -> may return error here sas_port_add_phy() ... Whereas hpsa_free_sas_phy() does: ... sas_port_delete_phy() sas_phy_delete() ... If hpsa_sas_port_add_phy() returns an error, hpsa_free_sas_phy() can not be called to free the memory because the port and the phy have not been added yet. Replace hpsa_free_sas_phy() with sas_phy_free() and kfree() to avoid kernel crash in this case. Fixes: d04e62b9d63a ("hpsa: add in sas transport class") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221110151129.394389-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-25	scsi: mpt3sas: Fix possible resource leaks in mpt3sas_transport_port_add()	Yang Yingliang
	In mpt3sas_transport_port_add(), if sas_rphy_add() returns error, sas_rphy_free() needs be called to free the resource allocated in sas_end_device_alloc(). Otherwise a kernel crash will happen: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000108 CPU: 45 PID: 37020 Comm: bash Kdump: loaded Tainted: G W 6.1.0-rc1+ #189 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : device_del+0x54/0x3d0 lr : device_del+0x37c/0x3d0 Call trace: device_del+0x54/0x3d0 attribute_container_class_device_del+0x28/0x38 transport_remove_classdev+0x6c/0x80 attribute_container_device_trigger+0x108/0x110 transport_remove_device+0x28/0x38 sas_rphy_remove+0x50/0x78 [scsi_transport_sas] sas_port_delete+0x30/0x148 [scsi_transport_sas] do_sas_phy_delete+0x78/0x80 [scsi_transport_sas] device_for_each_child+0x68/0xb0 sas_remove_children+0x30/0x50 [scsi_transport_sas] sas_rphy_remove+0x38/0x78 [scsi_transport_sas] sas_port_delete+0x30/0x148 [scsi_transport_sas] do_sas_phy_delete+0x78/0x80 [scsi_transport_sas] device_for_each_child+0x68/0xb0 sas_remove_children+0x30/0x50 [scsi_transport_sas] sas_remove_host+0x20/0x38 [scsi_transport_sas] scsih_remove+0xd8/0x420 [mpt3sas] Because transport_add_device() is not called when sas_rphy_add() fails, the device is not added. When sas_rphy_remove() is subsequently called to remove the device in the remove() path, a NULL pointer dereference happens. Fixes: f92363d12359 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221109032403.1636422-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-24	scsi: hpsa: Fix possible memory leak in hpsa_init_one()	Yuan Can
	The hpda_alloc_ctlr_info() allocates h and its field reply_map. However, in hpsa_init_one(), if alloc_percpu() failed, the hpsa_init_one() jumps to clean1 directly, which frees h and leaks the h->reply_map. Fix by calling hpda_free_ctlr_info() to release h->replay_map and h instead free h directly. Fixes: 8b834bff1b73 ("scsi: hpsa: fix selection of reply queue") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221122015751.87284-1-yuancan@huawei.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-24	scsi: core: Do not increase scsi_device's iorequest_cnt if dispatch failed	Wenchao Hao
	If scsi_dispatch_cmd() failed, the SCSI command was not sent to the target. scsi_queue_rq() would return BLK_STS_RESOURCE if scsi_dispatch_cmd() failed, and the related request would be requeued. The timeout of this request would not fire, so noone would increase iodone_cnt. Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Link: https://lore.kernel.org/r/20221123122137.150776-3-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-24	scsi: core: Increase scsi_device's iodone_cnt in scsi_timeout()	Wenchao Hao
	If a SCSI command times out and is going to be aborted, we should increase the iodone_cnt of the related scsi_device. Otherwise the iodone_cnt would be smaller than iorequest_cnt. Increasing iodone_cnt in scsi_timeout() would not cause a double accounting issue. Brief analysis follows: - We add the iodone_cnt when BLK_EH_DONE is returned in scsi_timeout(). The related command's timeout event would not happen. - If the abort succeeds and the command is not retried, the command would be completed with scsi_finish_command() which would not increase iodone_cnt. - If the abort succeeds and the command is retried, it would be requeue. A scsi_dispatch_cmd() would be called and iorequest_cnt would be increased again. - If the abort fails, the error handler successfully recovers the device, and the command is not retried, the command would be completed with scsi_finish_command() which would not increase iodone_cnt. - If the abort fails, the error handler successfully recovers the device, and the command is retried, the iorequest_cnt would be increased again. Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Link: https://lore.kernel.org/r/20221123122137.150776-2-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-24	scsi: iscsi: Rename iscsi_set_param() to iscsi_if_set_param()	Wenchao Hao
	There are two iscsi_set_param() functions defined in libiscsi.c and scsi_transport_iscsi.c respectively which is confusing. Rename the one in scsi_transport_iscsi.c to iscsi_if_set_param(). Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Link: https://lore.kernel.org/r/20221122181105.4123935-1-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-24	scsi: target: core: Fix hard lockup when executing a compare-and-write command	Maurizio Lombardi
	While handling an I/O completion for the compare portion of a COMPARE_AND_WRITE command, it may happen that the compare_and_write_callback function submits new bio structs while still in softirq context. Low level drivers like md raid5 do not expect their make_request call to be used in softirq context, they call into schedule() and create a deadlocked system. __schedule at ffffffff873a0807 schedule at ffffffff873a0cc5 raid5_get_active_stripe at ffffffffc0875744 [raid456] raid5_make_request at ffffffffc0875a50 [raid456] md_handle_request at ffffffff8713b9f9 md_make_request at ffffffff8713bacb generic_make_request at ffffffff86e6f14b submit_bio at ffffffff86e6f27c iblock_submit_bios at ffffffffc0b4e4dc [target_core_iblock] iblock_execute_rw at ffffffffc0b4f3ce [target_core_iblock] __target_execute_cmd at ffffffffc1090079 [target_core_mod] compare_and_write_callback at ffffffffc1093602 [target_core_mod] target_cmd_interrupted at ffffffffc108d1ec [target_core_mod] target_complete_cmd_with_sense at ffffffffc108d27c [target_core_mod] iblock_complete_cmd at ffffffffc0b4e23a [target_core_iblock] dm_io_dec_pending at ffffffffc00db29e [dm_mod] clone_endio at ffffffffc00dbf07 [dm_mod] raid5_align_endio at ffffffffc086d6c2 [raid456] blk_update_request at ffffffff86e6d950 scsi_end_request at ffffffff87063d48 scsi_io_completion at ffffffff87063ee8 blk_complete_reqs at ffffffff86e77b05 __softirqentry_text_start at ffffffff876000d7 This problem appears to be an issue between target_cmd_interrupted() and compare_and_write_callback(). target_cmd_interrupted() calls the se_cmd's transport_complete_callback function pointer if the se_cmd is being stopped or aborted, and CMD_T_ABORTED was set on the se_cmd. When calling compare_and_write_callback(), the success parameter was set to false. target_cmd_interrupted() seems to expect this means the callback will do cleanup that does not require a process context. But compare_and_write_callback() ignores the parameter if there was I/O done for the compare part of COMPARE_AND_WRITE. Since there was data, the function continued on, passed the compare, and issued a write while ignoring the value of the success parameter. The submit of a bio for the write portion of the COMPARE_AND_WRITE then causes schedule to be unsafely called from the softirq context. Fix the bug in compare_and_write_callback by jumping to the out label if success == "false", after checking if we have been called by transport_generic_request_failure(); The command is being aborted or stopped so there is no need to submit the write bio for the write part of the COMPARE_AND_WRITE command. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/20221121092703.316489-1-mlombard@redhat.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>