lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2024-06-11	dmaengine: idxd: Fix possible Use-After-Free in irq_process_work_list	Li RongQing
	Use list_for_each_entry_safe() to allow iterating through the list and deleting the entry in the iteration process. The descriptor is freed via idxd_desc_complete() and there's a slight chance may cause issue for the list iterator when the descriptor is reused by another thread without it being deleted from the list. Fixes: 16e19e11228b ("dmaengine: idxd: Fix list corruption in description completion") Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20240603012444.11902-1-lirongqing@baidu.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-04-07	dmaengine: idxd: Convert spinlock to mutex to lock evl workqueue	Rex Zhang
	drain_workqueue() cannot be called safely in a spinlocked context due to possible task rescheduling. In the multi-task scenario, calling queue_work() while drain_workqueue() will lead to a Call Trace as pushing a work on a draining workqueue is not permitted in spinlocked context. Call Trace: <TASK> ? __warn+0x7d/0x140 ? __queue_work+0x2b2/0x440 ? report_bug+0x1f8/0x200 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? __queue_work+0x2b2/0x440 queue_work_on+0x28/0x30 idxd_misc_thread+0x303/0x5a0 [idxd] ? __schedule+0x369/0xb40 ? __pfx_irq_thread_fn+0x10/0x10 ? irq_thread+0xbc/0x1b0 irq_thread_fn+0x21/0x70 irq_thread+0x102/0x1b0 ? preempt_count_add+0x74/0xa0 ? __pfx_irq_thread_dtor+0x10/0x10 ? __pfx_irq_thread+0x10/0x10 kthread+0x103/0x140 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> The current implementation uses a spinlock to protect event log workqueue and will lead to the Call Trace due to potential task rescheduling. To address the locking issue, convert the spinlock to mutex, allowing the drain_workqueue() to be called in a safe mutex-locked context. This change ensures proper synchronization when accessing the event log workqueue, preventing potential Call Trace and improving the overall robustness of the code. Fixes: c40bd7d9737b ("dmaengine: idxd: process user page faults for completion record") Signed-off-by: Rex Zhang <rex.zhang@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Lijun Pan <lijun.pan@intel.com> Link: https://lore.kernel.org/r/20240404223949.2885604-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-02-16	dmaengine: idxd: Remove shadow Event Log head stored in idxd	Fenghua Yu
	head is defined in idxd->evl as a shadow of head in the EVLSTATUS register. There are two issues related to the shadow head: 1. Mismatch between the shadow head and the state of the EVLSTATUS register: If Event Log is supported, upon completion of the Enable Device command, the Event Log head in the variable idxd->evl->head should be cleared to match the state of the EVLSTATUS register. But the variable is not reset currently, leading mismatch between the variable and the register state. The mismatch causes incorrect processing of Event Log entries. 2. Unnecessary shadow head definition: The shadow head is unnecessary as head can be read directly from the EVLSTATUS register. Reading head from the register incurs no additional cost because event log head and tail are always read together and tail is already read directly from the register as required by hardware. Remove the shadow Event Log head stored in idxd->evl to address the mentioned issues. Fixes: 244da66cda35 ("dmaengine: idxd: setup event log configuration") Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240215024931.1739621-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-12-15	dmaengine: idxd: add callback support for iaa crypto	Tom Zanussi
	Create a lightweight callback interface to allow idxd sub-drivers to be notified when work sent to idxd wqs has completed. For a sub-driver to be notified of work completion, it needs to: - Set the descriptor's 'Request Completion Interrupt' (IDXD_OP_FLAG_RCI) - Set the sub-driver desc_complete() callback when registering the sub-driver e.g.: struct idxd_device_driver my_drv = { .probe = my_probe, .desc_complete = my_complete, } - Set the sub-driver-specific context in the sub-driver's descriptor e.g: idxd_desc->crypto.req = req; idxd_desc->crypto.tfm = tfm; idxd_desc->crypto.src_addr = src_addr; idxd_desc->crypto.dst_addr = dst_addr; When the work completes and the completion irq fires, idxd will invoke the desc_complete() callback with pointers to the descriptor, context, and completion_type. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-09-28	dmaengine: idxd: rate limit printk in misc interrupt thread	Dave Jiang
	Add rate limit to the dev_warn() call in the misc interrupt thread. This limits dmesg getting spammed if a descriptor submitter is spamming bad descriptors with invalid completion records and resulting the errors being continuously reported by the misc interrupt handling thread. Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Lijun Pan <lijun.pan@intel.com> Link: https://lore.kernel.org/r/20230924002347.1117757-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-05-03	Merge tag 'dmaengine-6.4-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "New support: - Apple admac t8112 device support - StarFive JH7110 DMA controller Updates: - Big pile of idxd updates to support IAA 2.0 device capabilities, DSA 2.0 Event Log and completion record faulting features and new DSA operations - at_xdmac supend & resume updates and driver code cleanup - k3-udma supend & resume support - k3-psil thread support for J784s4" * tag 'dmaengine-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (57 commits) dmaengine: idxd: add per wq PRS disable dmaengine: idxd: add pid to exported sysfs attribute for opened file dmaengine: idxd: expose fault counters to sysfs dmaengine: idxd: add a device to represent the file opened dmaengine: idxd: add per file user counters for completion record faults dmaengine: idxd: process batch descriptor completion record faults dmaengine: idxd: add descs_completed field for completion record dmaengine: idxd: process user page faults for completion record dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling dmaengine: idxd: create kmem cache for event log fault items dmaengine: idxd: add per DSA wq workqueue for processing cr faults dmanegine: idxd: add debugfs for event log dump dmaengine: idxd: add interrupt handling for event log dmaengine: idxd: setup event log configuration dmaengine: idxd: add event log size sysfs attribute dmaengine: idxd: make misc interrupt one shot dt-bindings: dma: snps,dw-axi-dmac: constrain the items of resets for JH7110 dma dt-bindings: dma: Drop unneeded quotes dmaengine: at_xdmac: align declaration of ret with the rest of variables dmaengine: at_xdmac: add a warning message regarding for unpaused channels ...
2023-04-12	dmaengine: idxd: add per file user counters for completion record faults	Dave Jiang
	Add counters per opened file for the char device in order to keep track how many completion record faults occurred and how many of those faults failed the writeback by the driver after attempt to fault in the page. The counters are managed by xarray that associates the PASID with struct idxd_user_context. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-13-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: process batch descriptor completion record faults	Dave Jiang
	Add event log processing for faulting of user batch descriptor completion record. When encountering an event log entry for a page fault on a completion record, the driver is expected to do the following: 1. If the "first error in batch" bit in event log entry error info is set, discard any previously recorded errors associated with the "batch identifier". 2. Fix the page fault according to the fault address in the event log. If successful, write the completion record to the fault address in user space. 3. If an error is encountered while writing the completion record and it is associated to a descriptor in the batch, the driver associates the error with the batch identifier of the event log entry and tracks it until the event log entry for the corresponding batch desc is encountered. While processing an event log entry for a batch descriptor with error indicating that one or more descs in the batch had event log entries, the driver will do the following before writing the batch completion record: 1. If the status field of the completion record is 0x1, the driver will change it to error code 0x5 (one or more operations in batch completed with status not successful) and changes the result field to 1. 2. If the status is error code 0x6 (page fault on batch descriptor list address), change the result field to 1. 3. If status is any other value, the completion record is not changed. 4. Clear the recorded error in preparation for next batch with same batch identifier. The result field is for user software to determine whether to set the "Batch Error" flag bit in the descriptor for continuation of partial batch descriptor completion. See DSA spec 2.0 for additional information. If no error has been recorded for the batch, the batch completion record is written to user space as is. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-12-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: process user page faults for completion record	Dave Jiang
	DSA supports page fault handling through PRS. However, the DMA engine that's processing the descriptor is blocked until the PRS response is received. Other workqueues sharing the engine are also blocked. Page fault handing by the driver with PRS disabled can be used to mitigate the stalling. With PRS disabled while ATS remain enabled, DSA handles page faults on a completion record by reporting an event in the event log. In this instance, the descriptor is completed and the event log contains the completion record address and the contents of the completion record. Add support to the event log handling code to fault in the completion record and copy the content of the completion record to user memory. A bitmap is introduced to keep track of discarded event log entries. When the user process initiates ->release() of the char device, it no longer is interested in any remaining event log entries tied to the relevant wq and PASID. The driver will mark the event log entry index in the bitmap. Upon encountering the entries during processing, the event log handler will just clear the bitmap bit and skip the entry rather than attempt to process the event log entry. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-10-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: add interrupt handling for event log	Dave Jiang
	An event log interrupt is raised in the misc interrupt INTCAUSE register when an event is written by the hardware. Add basic event log processing support to the interrupt handler. The event log is a ring where the hardware owns the tail and the software owns the head. The hardware will advance the tail index when an additional event has been pushed to memory. The software will process the log entry and then advances the head. The log is full when (tail + 1) % log_size = head. The hardware will stop writing when the log is full. The user is expected to create a log size large enough to handle all the expected events. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-5-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: make misc interrupt one shot	Dave Jiang
	Current code continuously processes the interrupt as long as the hardware is setting the status bit. There's no reason to do that since the threaded handler will get called again if another interrupt is asserted. Also through testing, it has shown that if a misprogrammed (or malicious) agent can continuously submit descriptors with bad completion record and causes errors to be reported via the misc interrupt. Continuous processing by the thread can cause software hang watchdog to kick off since the thread isn't giving up the CPU. Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-2-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-03-31	iommu/ioasid: Rename INVALID_IOASID	Jacob Pan
	INVALID_IOASID and IOMMU_PASID_INVALID are duplicated. Rename INVALID_IOASID and consolidate since we are moving away from IOASID infrastructure. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/20230322200803.869130-7-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-29	dmaengine: idxd: Remove unused struct idxd_fault	Yuan Can
	Since fault processing code has been removed, struct idxd_fault is not used any more and can be removed as well. Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20220928014747.106808-1-yuancan@huawei.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmaengine: idxd: track enabled workqueues in bitmap	Jerry Snitselaar
	Now that idxd_wq_disable_cleanup() sets the workqueue state to IDXD_WQ_DISABLED, use a bitmap to track which workqueues have been enabled. This will then be used to determine which workqueues should be re-enabled when attempting a software reset to recover from a device halt state. Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20220928154856.623545-3-jsnitsel@redhat.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-04	dmaengine: idxd: avoid deadlock in process_misc_interrupts()	Jerry Snitselaar
	idxd_device_clear_state() now grabs the idxd->dev_lock itself, so don't grab the lock prior to calling it. This was seen in testing after dmar fault occurred on system, resulting in lockup stack traces. Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Cc: dmaengine@vger.kernel.org Fixes: cf4ac3fef338 ("dmaengine: idxd: fix lockdep warning on device driver removal") Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20220823163709.2102468-1-jsnitsel@redhat.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-01-05	dmaengine: idxd: change MSIX allocation based on per wq activation	Dave Jiang
	Change the driver where WQ interrupt is requested only when wq is being enabled. This new scheme set things up so that request_threaded_irq() is only called when a kernel wq type is being enabled. This also sets up for future interrupt request where different interrupt handler such as wq occupancy interrupt can be setup instead of the wq completion interrupt. Not calling request_irq() until the WQ actually needs an irq also prevents wasting of CPU irq vectors on x86 systems, which is a limited resource. idxd_flush_pending_descs() is moved to device.c since descriptor flushing is now part of wq disable rather than shutdown(). Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163942149487.2412839.6691222855803875848.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-01-05	dmaengine: idxd: embed irq_entry in idxd_wq struct	Dave Jiang
	With irq_entry already being associated with the wq in a 1:1 relationship, embed the irq_entry in the idxd_wq struct and remove back pointers for idxe_wq and idxd_device. In the process of this work, clean up the interrupt handle assignment so that there's no decision to be made during submit call on where interrupt handle value comes from. Set the interrupt handle during irq request initialization time. irq_entry 0 is designated as special and is tied to the device itself. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163942148362.2412839.12055447853311267866.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-01-05	Merge branch 'fixes' into next	Vinod Koul
	We have a conflict in idxd driver between 'fixes' and 'next' and there are patches dependent on this so, merge the 'fixes' branch into next
2021-12-17	dmaengine: idxd: add knob for enqcmds retries	Dave Jiang
	Add a sysfs knob to allow tuning of retries for the kernel ENQCMDS descriptor submission. While on host, it is not as likely that ENQCMDS return busy during normal operations due to the driver controlling the number of descriptors allocated for submission. However, when the driver is operating as a guest driver, the chance of retry goes up significantly due to sharing a wq with multiple VMs. A default value is provided with the system admin being able to tune the value on a per WQ basis. Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163820629464.2702134.7577370098568297574.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22	dmaengine: idxd: fix calling wq quiesce inside spinlock	Dave Jiang
	Dan reports that smatch has found idxd_wq_quiesce() is being called inside the idxd->dev_lock. idxd_wq_quiesce() calls wait_for_completion() and therefore it can sleep. Move the call outside of the spinlock as it does not need device lock. Fixes: 5b0c68c473a1 ("dmaengine: idxd: support reporting of halt interrupt") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163716858508.1721911.15051495873516709923.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22	dmaengine: idxd: handle interrupt handle revoked event	Dave Jiang
	"Interrupt handle revoked" is an event that happens when the driver is running on a guest kernel and the VM is migrated to a new machine. The device will trigger an interrupt that signals to the guest driver that the interrupt handles need to be replaced. The misc irq thread function calls a helper function to handle the event. The function uses the WQ percpu_ref to quiesce the kernel submissions. It then replaces the interrupt handles by requesting interrupt handle command for each I/O MSIX vector. Once the handle is updated, the driver will unblock the submission path to allow new submissions. The submitter will attempt to acquire a percpu_ref before submission. When the request fails, it will wait on the wq_resurrect 'completion'. The driver does anticipate the possibility of descriptors being submitted before the WQ percpu_ref is killed. If a descriptor has already been submitted, it will return with incorrect interrupt handle status. The descriptor will be re-submitted with the new interrupt handle on the completion path. For descriptors with incorrect interrupt handles, completion interrupt won't be triggered. At the completion of the interrupt handle refresh, the handling function will call idxd_int_handle_refresh_drain() to issue drain descriptors to each of the wq with associated interrupt handle. The drain descriptor will have interrupt request set but without completion record. This will ensure all descriptors with incorrect interrupt completion handle get drained and a completion interrupt is triggered for the guest driver to process them. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Co-Developed-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528420189.3925689.18212568593220415551.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22	dmaengine: idxd: handle invalid interrupt handle descriptors	Dave Jiang
	Handle a descriptor that has been marked with invalid interrupt handle error in status. Create a work item that will resubmit the descriptor. This typically happens when the driver has handled the revoke interrupt handle event and has a new interrupt handle. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528419601.3925689.4166517602890523193.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22	dmaengine: idxd: add helper for per interrupt handle drain	Dave Jiang
	The helper is called at the completion of the interrupt handle refresh event. It issues drain descriptors to each of the wq with associated interrupt handle. The drain descriptor will have interrupt request set but without completion record. This will ensure all descriptors with incorrect interrupt completion handle get drained and a completion interrupt is triggered for the guest driver to process them. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528418315.3925689.7944718440052849626.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22	dmaengine: idxd: rework descriptor free path on failure	Dave Jiang
	Refactor the completion function to allow skipping of descriptor freeing on the submission failure path. This completely removes descriptor freeing from the submit failure path and leave the responsibility to the caller. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528416222.3925689.12859769271667814762.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-10-25	dmaengine: idxd: add halt interrupt support	Dave Jiang
	Add halt interrupt support. Given that the misc interrupt handler already check halt state, the driver just need to run the halt handling code when receiving the halt interrupt. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163114224352.846654.14334468363464318828.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-10-25	dmaengine: idxd: Use list_move_tail instead of list_del/list_add_tail	Bixuan Cui
	Using list_move_tail() instead of list_del() + list_add_tail() Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20210908092826.67765-1-cuibixuan@huawei.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-08-29	dmaengine: idxd: remove interrupt disable for dev_lock	Dave Jiang
	The spinlock is not being used in hard interrupt context. There is no need to disable irq when acquiring the lock. The interrupt thread handler also is not in bottom half context, therefore we can also remove disabling of the bh. Convert all dev_lock acquisition to plain spin_lock() calls. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162984026772.1939166.11504067782824765879.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-08-06	dmaengine: idxd: remove interrupt flag for completion list spinlock	Dave Jiang
	The list lock is never acquired in interrupt context. Therefore there is no need to disable interrupts. Remove interrupt flags for lock operations. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162826417450.3454650.3733188117742416238.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-08-06	dmaengine: idxd: make I/O interrupt handler one shot	Dave Jiang
	The interrupt thread handler currently loops forever to process outstanding completions. This causes either an "irq X: nobody cared" kernel splat or the NMI watchdog kicks in due to running too long in the function. The irq thread handler is expected to run again after exiting if there are interrupts fired while the thread handler is running. So the handler code can process all the completed I/O in a single pass and exit without losing the follow on completed I/O. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162802977005.3084234.11836261157026497585.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-08-06	dmaengine: idxd: Remove unused status variable in irq_process_work_list()	Nathan Chancellor
	status is no longer used within this block: drivers/dma/idxd/irq.c:255:6: warning: unused variable 'status' [-Wunused-variable] u8 status = desc->completion->status & DSA_COMP_STATUS_MASK; ^ 1 warning generated. Fixes: b60bb6e2bfc1 ("dmaengine: idxd: fix abort status check") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20210802175820.3153920-1-nathan@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-29	dmaengine: idxd: fix abort status check	Dave Jiang
	Coverity static analysis of linux-next found issue. The check (status == IDXD_COMP_DESC_ABORT) is always false since status was previously masked with 0x7f and IDXD_COMP_DESC_ABORT is 0xff. Fixes: 6b4b87f2c31a ("dmaengine: idxd: fix submission race window") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162698465160.3560828.18173186265683415384.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-21	dmaengine: idxd: remove fault processing code	Dave Jiang
	Kernel memory are pinned and will not cause faults. Since the driver does not support interrupts for user descriptors, no fault errors are expected to come through the misc interrupt. Remove dead code. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162630502789.631986.10591230961790023856.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-21	dmaengine: idxd: add 'struct idxd_dev' as wrapper for conf_dev	Dave Jiang
	Add a 'struct idxd_dev' that wraps the 'struct device' for idxd conf_dev that registers with the dsa bus. This is introduced in order to deal with multiple different types of 'devices' that are registered on the dsa_bus when the compat driver needs to route them to the correct driver to attach. The bind() call now can determine the type of device and then do the appropriate driver matching. Reviewed-by Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637460065.744545.584492831446090984.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-21	Merge branch 'fixes' into next	Vinod Koul
	Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-20	dmaengine: idxd: fix submission race window	Dave Jiang
	Konstantin observed that when descriptors are submitted, the descriptor is added to the pending list after the submission. This creates a race window with the slight possibility that the descriptor can complete before it gets added to the pending list and this window would cause the completion handler to miss processing the descriptor. To address the issue, the addition of the descriptor to the pending list must be done before it gets submitted to the hardware. However, submitting to swq with ENQCMDS instruction can cause a failure with the condition of either wq is full or wq is not "active". With the descriptor allocation being the gate to the wq capacity, it is not possible to hit a retry with ENQCMDS submission to the swq. The only possible failure can happen is when wq is no longer "active" due to hw error and therefore we are moving towards taking down the portal. Given this is a rare condition and there's no longer concern over I/O performance, the driver can walk the completion lists in order to retrieve and abort the descriptor. The error path will set the descriptor to aborted status. It will take the work list lock to prevent further processing of worklist. It will do a delete_all on the pending llist to retrieve all descriptors on the pending llist. The delete_all action does not require a lock. It will walk through the acquired llist to find the aborted descriptor while add all remaining descriptors to the work list since it holds the lock. If it does not find the aborted descriptor on the llist, it will walk through the work list. And if it still does not find the descriptor, then it means the interrupt handler has removed the desc from the llist but is pending on the work list lock and will process it once the error path releases the lock. Fixes: eb15e7154fbf ("dmaengine: idxd: add interrupt handle request and release support") Reported-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162628855747.360485.10101925573082466530.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-14	dmanegine: idxd: cleanup all device related bits after disabling device	Dave Jiang
	The previous state cleanup patch only performed wq state cleanups. This does not go far enough as when device is disabled or reset, the state for groups and engines must also be cleaned up. Add additional state cleanup beyond wq cleanup. Tie those cleanups directly to device disable and reset, and wq disable and reset. Fixes: da32b28c95a7 ("dmaengine: idxd: cleanup workqueue config after disabling") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162285154108.2096632.5572805472362321307.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-25	dmaengine: idxd: Enable IDXD performance monitor support	Tom Zanussi
	Add the code needed in the main IDXD driver to interface with the IDXD perfmon implementation. [ Based on work originally by Jing Lin. ] Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Link: https://lore.kernel.org/r/a5564a5583911565d31c2af9234218c5166c4b2c.1619276133.git.zanussi@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-23	dmaengine: idxd: remove MSIX masking for interrupt handlers	Dave Jiang
	Remove interrupt masking and just let the hard irq handler keep firing for new events. This is less of a performance impact vs the MMIO readback inside the pci_msi_{mask,unmas}_irq(). Especially with a loaded system those flushes can be stuck behind large amounts of MMIO writes to flush. When guest kernel is running on top of VFIO mdev, mask/unmask causes a vmexit each time and is not desirable. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/161894523436.3210025.1834640110556139277.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-23	dmaengine: idxd: support reporting of halt interrupt	Dave Jiang
	Unmask the halt error interrupt so it gets reported to the interrupt handler. When halt state interrupt is received, quiesce the kernel WQs and unmap the portals to stop submission. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/161894441167.3202472.9485946398140619501.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-20	dmaengine: idxd: fix cdev setup and free device lifetime issues	Dave Jiang
	The char device setup and cleanup has device lifetime issues regarding when parts are initialized and cleaned up. The initialization of struct device is done incorrectly. device_initialize() needs to be called on the 'struct device' and then additional changes can be added. The ->release() function needs to be setup via device_type before dev_set_name() to allow proper cleanup. The change re-parents the cdev under the wq->conf_dev to get natural reference inheritance. No known dependency on the old device path exists. Reported-by: Jason Gunthorpe <jgg@nvidia.com> Fixes: 42d279f9137a ("dmaengine: idxd: add char driver to expose submission portal to userland") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/161852987721.2203940.1478218825576630810.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-20	dmaengine: idxd: fix wq conf_dev 'struct device' lifetime	Dave Jiang
	Remove devm_* allocation and fix wq->conf_dev 'struct device' lifetime. Address issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE. Add release functions in order to free the allocated memory for the wq context at device destruction time. Reported-by: Jason Gunthorpe <jgg@nvidia.com> Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/161852985907.2203940.6840120734115043753.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-12	dmaengine: idxd: Fix clobbering of SWERR overflow bit on writeback	Dave Jiang
	Current code blindly writes over the SWERR and the OVERFLOW bits. Write back the bits actually read instead so the driver avoids clobbering the OVERFLOW bit that comes after the register is read. Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/161352082229.3511254.1002151220537623503.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-01-17	dmaengine: idxd: fix misc interrupt completion	Dave Jiang
	Nikhil reported the misc interrupt handler can sometimes miss handling the command interrupt when an error interrupt happens near the same time. Have the irq handling thread continue to process the misc interrupts until all interrupts are processed. This is a low usage interrupt and is not expected to handle high volume traffic. Therefore there is no concern of this thread running for a long time. Fixes: 0d5c10b4c84d ("dmaengine: idxd: add work queue drain support") Reported-by: Nikhil Rao <nikhil.rao@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/161074755329.2183844.13295528344116907983.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-01-17	dmaengine: idxd: Fix list corruption in description completion	Dave Jiang
	Sanjay reported the following kernel splat after running dmatest for stress testing. The current code is giving up the spinlock in the middle of a completion list walk, and that opens up opportunity for list corruption if another thread touches the list at the same time. In order to make sure the list is always protected, the hardware completed descriptors will be put on a local list to be completed with callbacks from the outside of the list lock. kernel: general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] SMP NOPTI kernel: CPU: 62 PID: 1814 Comm: irq/89-idxd-por Tainted: G W 5.10.0-intel-next_10_16+ #1 kernel: Hardware name: Intel Corporation ArcherCity/ArcherCity, BIOS EGSDCRB1.SBT.4915.D02.2012070418 12/07/2020 kernel: RIP: 0010:irq_process_work_list+0xcd/0x170 [idxd] kernel: Code: e8 18 65 5c d3 8b 45 d4 85 c0 75 b3 4c 89 f7 e8 b9 fe ff ff 84 c0 74 bf 4c 89 e7 e8 dd 6b 5c d3 49 8b 3f 49 8b 4f 08 48 89 c6 <48> 89 4f 08 48 89 39 4c 89 e7 48 b9 00 01 00 00 00 00 ad de 49 89 kernel: RSP: 0018:ff256768c4353df8 EFLAGS: 00010046 kernel: RAX: 0000000000000202 RBX: dead000000000100 RCX: dead000000000122 kernel: RDX: 0000000000000001 RSI: 0000000000000202 RDI: dead000000000100 kernel: RBP: ff256768c4353e40 R08: 0000000000000001 R09: 0000000000000000 kernel: R10: 0000000000000000 R11: 00000000ffffffff R12: ff1fdf9fd06b3e48 kernel: R13: 0000000000000005 R14: ff1fdf9fc4275980 R15: ff1fdf9fc4275a00 kernel: FS: 0000000000000000(0000) GS:ff1fdfa32f880000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 00007f87f24012a0 CR3: 000000010f630004 CR4: 0000000003771ee0 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 kernel: DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 kernel: PKRU: 55555554 kernel: Call Trace: kernel: ? irq_thread+0xa9/0x1b0 kernel: idxd_wq_thread+0x34/0x90 [idxd] kernel: irq_thread_fn+0x24/0x60 kernel: irq_thread+0x10f/0x1b0 kernel: ? irq_forced_thread_fn+0x80/0x80 kernel: ? wake_threads_waitq+0x30/0x30 kernel: ? irq_thread_check_affinity+0xe0/0xe0 kernel: kthread+0x142/0x160 kernel: ? kthread_park+0x90/0x90 kernel: ret_from_fork+0x1f/0x30 kernel: Modules linked in: idxd_ktest dmatest intel_rapl_msr idxd_mdev iTCO_wdt vfio_pci vfio_virqfd iTCO_vendor_support intel_rapl_common i10nm_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl msr pcspkr snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg ofpart snd_hda_codec snd_hda_core snd_hwdep cmdlinepart snd_seq snd_seq_device idxd snd_pcm intel_spi_pci intel_spi snd_timer spi_nor input_leds joydev snd i2c_i801 mtd soundcore i2c_smbus mei_me mei i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sunrpc nls_iso8859_1 sch_fq_codel ip_tables x_tables xfs libcrc32c ast drm_vram_helper drm_ttm_helper ttm igc wmi pinctrl_sunrisepoint hid_generic usbmouse usbkbd usbhid hid kernel: ---[ end trace cd5ca950ef0db25f ]--- Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Fixes: e4f4d8cdeb9a ("dmaengine: idxd: Clean up descriptors with fault error") Link: https://lore.kernel.org/r/161074757267.2183951.17912830858060607266.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-10-30	dmaengine: idxd: Clean up descriptors with fault error	Dave Jiang
	Add code to "complete" a descriptor when the descriptor or its completion address hit a fault error when SVA mode is being used. This error can be triggered due to bad programming by the user. A lock is introduced in order to protect the descriptor completion lists since the fault handler will run from the system work queue after being scheduled in the interrupt handler. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/160382008092.3911367.12766483427643278985.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-09-11	Merge tag 'v5.9-rc4' into next	Vinod Koul
	Linux 5.9-rc4
2020-08-17	dmaengine: idxd: clear misc interrupt cause after read	Dave Jiang
	Move the clearing of misc interrupt cause to immediately after reading the register in order to allow the next interrupt to be asserted. Suggested-by: Nikhil Rao <nikhil.rao@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/159603824665.28647.5344356370364397996.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-08-17	dmaengine: idxd: reset states after device disable or reset	Dave Jiang
	The state for WQs should be reset to disabled when a device is reset or disabled. Fixes: da32b28c95a7 ("dmaengine: idxd: cleanup workqueue config after disabling") Reported-by: Mona Hossain <mona.hossain@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/159586777684.27150.17589406415773568534.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-08-05	Merge branch 'for-linus' into fixes	Vinod Koul
	Signed-off-by: Vinod Koul <vkoul@kernel.org> Conflicts: drivers/dma/idxd/sysfs.c
2020-07-13	dmaengine: idxd: move idxd interrupt handling to mask instead of ignore	Dave Jiang
	Switch driver to use MSIX mask and unmask instead of the ignore bit. When ignore bit is cleared, we must issue an MMIO read to ensure writes have all arrived and check and process any additional completions. The ignore bit does not queue up any pending MSIX interrupts. The mask bit however does. Use API call from interrupt subsystem to mask MSIX interrupt since the hardware does not have convenient mask bit register. Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/159319517621.70410.11816465052708900506.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>