lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2024-09-10	dmaengine: idxd: Clean up cpumask and hotplug for perfmon	Kan Liang
	The idxd PMU is system-wide scope, which is supported by the generic perf_event subsystem now. Set the scope for the idxd PMU and remove all the cpumask and hotplug codes. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20240802151643.1691631-6-kan.liang@linux.intel.com
2024-05-13	dmaengine: idxd: add a new security check to deal with a hardware erratum	Arjan van de Ven
	On Sapphire Rapids and related platforms, the DSA and IAA devices have an erratum that causes direct access (for example, by using the ENQCMD or MOVDIR64 instructions) from untrusted applications to be a security problem. To solve this, add a flag to the PCI device enumeration and device structures to indicate the presence/absence of this security exposure. In the mmap() method of the device, this flag is then used to enforce that the user has the CAP_SYS_RAWIO capability. In a future patch, a write() based method will be added that allows untrusted applications submit work to the accelerator, where the kernel can do sanity checking on the user input to ensure secure operation of the accelerator. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2024-04-07	dmaengine: idxd: Convert spinlock to mutex to lock evl workqueue	Rex Zhang
	drain_workqueue() cannot be called safely in a spinlocked context due to possible task rescheduling. In the multi-task scenario, calling queue_work() while drain_workqueue() will lead to a Call Trace as pushing a work on a draining workqueue is not permitted in spinlocked context. Call Trace: <TASK> ? __warn+0x7d/0x140 ? __queue_work+0x2b2/0x440 ? report_bug+0x1f8/0x200 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? __queue_work+0x2b2/0x440 queue_work_on+0x28/0x30 idxd_misc_thread+0x303/0x5a0 [idxd] ? __schedule+0x369/0xb40 ? __pfx_irq_thread_fn+0x10/0x10 ? irq_thread+0xbc/0x1b0 irq_thread_fn+0x21/0x70 irq_thread+0x102/0x1b0 ? preempt_count_add+0x74/0xa0 ? __pfx_irq_thread_dtor+0x10/0x10 ? __pfx_irq_thread+0x10/0x10 kthread+0x103/0x140 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> The current implementation uses a spinlock to protect event log workqueue and will lead to the Call Trace due to potential task rescheduling. To address the locking issue, convert the spinlock to mutex, allowing the drain_workqueue() to be called in a safe mutex-locked context. This change ensures proper synchronization when accessing the event log workqueue, preventing potential Call Trace and improving the overall robustness of the code. Fixes: c40bd7d9737b ("dmaengine: idxd: process user page faults for completion record") Signed-off-by: Rex Zhang <rex.zhang@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Lijun Pan <lijun.pan@intel.com> Link: https://lore.kernel.org/r/20240404223949.2885604-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-03-15	Merge tag 'dmaengine-6.9-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "New hardware support: - Allwinner H616 dma support - Renesas r8a779h0 dma controller support - TI CSI2RX dma support Updates: - Freescale edma driver updates for TCD64csupport for i.MX95 - constify of pointers and args - Yaml conversion for MediaTek High-Speed controller binding - TI k3 udma support for TX/RX DMA channels for thread IDs: * tag 'dmaengine-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (25 commits) dmaengine: of: constify of_phandle_args in of_dma_find_controller() dmaengine: pl08x: constify pointer to char in filter function MAINTAINERS: change in AMD ptdma maintainer MAINTAINERS: adjust file entry in MEDIATEK DMA DRIVER dmaengine: idxd: constify the struct device_type usage dt-bindings: renesas,rcar-dmac: Add r8a779h0 support dt-bindings: dma: convert MediaTek High-Speed controller to the json-schema dmaengine: idxd: make dsa_bus_type const dmaengine: fsl-edma: integrate TCD64 support for i.MX95 dt-bindings: fsl-dma: fsl-edma: add fsl,imx95-edma5 compatible string dmaengine: mcf-edma: utilize edma_write_tcdreg() macro for TCD Access dmaengine: fsl-edma: add address for channel mux register in fsl_edma_chan dmaengine: fsl-edma: fix spare build warning dmaengine: fsl-edma: involve help macro fsl_edma_set(get)_tcd() dt-bindings: mmp-dma: convert to YAML dmaengine: ti: k3-psil-j721s2: Add entry for CSI2RX dmaengine: ti: k3-udma-glue: Add function to request RX chan for thread ID dmaengine: ti: k3-udma-glue: Add function to request TX chan for thread ID dmaengine: ti: k3-udma-glue: Update name for remote RX channel device dmaengine: ti: k3-udma-glue: Add function to parse channel by ID ...
2024-02-22	dmaengine: idxd: constify the struct device_type usage	Ricardo B. Marliere
	Since commit aed65af1cc2f ("drivers: make device_type const"), the driver core can properly handle constant struct device_type. Move the dsa_device_type, iax_device_type, idxd_wq_device_type, idxd_cdev_file_type, idxd_cdev_device_type and idxd_group_device_type variables to be constant structures as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Ricardo B. Marliere" <ricardo@marliere.net> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20240219-device_cleanup-dmaengine-v1-1-9f72f3cf3587@marliere.net Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-02-16	dmaengine: idxd: make dsa_bus_type const	Ricardo B. Marliere
	Since commit d492cc2573a0 ("driver core: device.h: make struct bus_type a const *"), the driver core can properly handle constant struct bus_type, move the dsa_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Ricardo B. Marliere" <ricardo@marliere.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240213-bus_cleanup-idxd-v1-1-c3e703675387@marliere.net Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-02-16	dmaengine: idxd: Remove shadow Event Log head stored in idxd	Fenghua Yu
	head is defined in idxd->evl as a shadow of head in the EVLSTATUS register. There are two issues related to the shadow head: 1. Mismatch between the shadow head and the state of the EVLSTATUS register: If Event Log is supported, upon completion of the Enable Device command, the Event Log head in the variable idxd->evl->head should be cleared to match the state of the EVLSTATUS register. But the variable is not reset currently, leading mismatch between the variable and the register state. The mismatch causes incorrect processing of Event Log entries. 2. Unnecessary shadow head definition: The shadow head is unnecessary as head can be read directly from the EVLSTATUS register. Reading head from the register incurs no additional cost because event log head and tail are always read together and tail is already read directly from the register as required by hardware. Remove the shadow Event Log head stored in idxd->evl to address the mentioned issues. Fixes: 244da66cda35 ("dmaengine: idxd: setup event log configuration") Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240215024931.1739621-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-12-15	dmaengine: idxd: Add support for device/wq defaults	Tom Zanussi
	Add a load_device_defaults() function pointer to struct idxd_driver_data, which if defined, will be called when an idxd device is probed and will allow the idxd device to be configured with default values. The load_device_defaults() function is passed an idxd device to work with to set specific device attributes. Also add a load_device_defaults() implementation IAA devices; future patches would add default functions for other device types such as DSA. The way idxd device probing works, if the device configuration is valid at that point e.g. at least one workqueue and engine is properly configured then the device will be enabled and ready to go. The IAA implementation, idxd_load_iaa_device_defaults(), configures a single workqueue (wq0) for each device with the following default values: mode "dedicated" threshold 0 size Total WQ Size from WQCAP priority 10 type IDXD_WQT_KERNEL group 0 name "iaa_crypto" driver_name "crypto" Note that this now adds another configuration step for any users that want to configure their own devices/workqueus with something different in that they'll first need to disable (in the case of IAA) wq0 and the device itself before they can set their own attributes and re-enable, since they've been already been auto-enabled. Note also that in order for the new configuration to be applied to the deflate-iaa crypto algorithm the iaa_crypto module needs to unregister the old version, which is accomplished by removing the iaa_crypto module, and re-registering it with the new configuration by reinserting the iaa_crypto module. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-15	dmaengine: idxd: add callback support for iaa crypto	Tom Zanussi
	Create a lightweight callback interface to allow idxd sub-drivers to be notified when work sent to idxd wqs has completed. For a sub-driver to be notified of work completion, it needs to: - Set the descriptor's 'Request Completion Interrupt' (IDXD_OP_FLAG_RCI) - Set the sub-driver desc_complete() callback when registering the sub-driver e.g.: struct idxd_device_driver my_drv = { .probe = my_probe, .desc_complete = my_complete, } - Set the sub-driver-specific context in the sub-driver's descriptor e.g: idxd_desc->crypto.req = req; idxd_desc->crypto.tfm = tfm; idxd_desc->crypto.src_addr = src_addr; idxd_desc->crypto.dst_addr = dst_addr; When the work completes and the completion irq fires, idxd will invoke the desc_complete() callback with pointers to the descriptor, context, and completion_type. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-15	dmaengine: idxd: Add wq private data accessors	Tom Zanussi
	Add the accessors idxd_wq_set_private() and idxd_wq_get_private() allowing users to set and retrieve a private void * associated with an idxd_wq. The private data is stored in the idxd_dev.conf_dev associated with each idxd_wq. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-15	dmaengine: idxd: Rename drv_enable/disable_wq to idxd_drv_enable/disable_wq, ↵	Tom Zanussi
	and export Rename drv_enable_wq and drv_disable_wq to idxd_drv_enable_wq and idxd_drv_disable_wq respectively, so that they're no longer too generic to be exported. This also matches existing naming within the idxd driver. And to allow idxd sub-drivers to enable and disable wqs, export them. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-15	dmaengine: idxd: add external module driver support for dsa_bus_type	Dave Jiang
	Add support to allow an external driver to be registered to the dsa_bus_type and also auto-loaded. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-04	dmaengine: idxd: add wq driver name support for accel-config user tool	Dave Jiang
	With the possibility of multiple wq drivers that can be bound to the wq, the user config tool accel-config needs a way to know which wq driver to bind to the wq. Introduce per wq driver_name sysfs attribute where the user can indicate the driver to be bound to the wq. This allows accel-config to just bind to the driver using wq->driver_name. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20230908201045.4115614-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-09-03	Merge tag 'dmaengine-6.6-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "New controller support and updates to drivers. New support: - Qualcomm SM6115 and QCM2290 dmaengine support - at_xdma support for microchip,sam9x7 controller Updates: - idxd updates for wq simplification and ats knob updates - fsl edma updates for v3 support - Xilinx AXI4-Stream control support - Yaml conversion for bcm dma binding" * tag 'dmaengine-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (53 commits) dmaengine: fsl-edma: integrate v3 support dt-bindings: fsl-dma: fsl-edma: add edma3 compatible string dmaengine: fsl-edma: move tcd into struct fsl_dma_chan dmaengine: fsl-edma: refactor chan_name setup and safety dmaengine: fsl-edma: move clearing of register interrupt into setup_irq function dmaengine: fsl-edma: refactor using devm_clk_get_enabled dmaengine: fsl-edma: simply ATTR_DSIZE and ATTR_SSIZE by using ffs() dmaengine: fsl-edma: move common IRQ handler to common.c dmaengine: fsl-edma: Remove enum edma_version dmaengine: fsl-edma: transition from bool fields to bitmask flags in drvdata dmaengine: fsl-edma: clean up EXPORT_SYMBOL_GPL in fsl-edma-common.c dmaengine: fsl-edma: fix build error when arch is s390 dmaengine: idxd: Fix issues with PRS disable sysfs knob dmaengine: idxd: Allow ATS disable update only for configurable devices dmaengine: xilinx_dma: Program interrupt delay timeout dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit dmaengine: xilinx_dma: Increase AXI DMA transaction segment count dmaengine: xilinx_dma: Pass AXI4-Stream control words to dma client dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property ...
2023-08-21	dmaengine: idxd: Remove unused declarations	Yue Haibing
	Commit c05257b5600b ("dmanegine: idxd: open code the dsa_drv registration") removed idxd_{un}register_driver() definitions but not the declarations. Commit 034b3290ba25 ("dmaengine: idxd: create idxd_device sub-driver") declared idxd_{un}register_idxd_drv() but never implemented it. Commit 8f47d1a5e545 ("dmaengine: idxd: connect idxd to dmaengine subsystem") declared idxd_parse_completion_status() but never implemented it. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230817114135.50264-1-yuehaibing@huawei.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-08-09	dmaengine/idxd: Re-enable kernel workqueue under DMA API	Jacob Pan
	Kernel workqueues were disabled due to flawed use of kernel VA and SVA API. Now that we have the support for attaching PASID to the device's default domain and the ability to reserve global PASIDs from SVA APIs, we can re-enable the kernel work queues and use them under DMA API. We also use non-privileged access for in-kernel DMA to be consistent with the IOMMU settings. Consequently, interrupt for user privilege is enabled for work completion IRQs. Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/ Tested-by: Tony Zhu <tony.zhu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/20230802212427.1497170-9-jacob.jun.pan@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-03	Merge tag 'dmaengine-6.4-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "New support: - Apple admac t8112 device support - StarFive JH7110 DMA controller Updates: - Big pile of idxd updates to support IAA 2.0 device capabilities, DSA 2.0 Event Log and completion record faulting features and new DSA operations - at_xdmac supend & resume updates and driver code cleanup - k3-udma supend & resume support - k3-psil thread support for J784s4" * tag 'dmaengine-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (57 commits) dmaengine: idxd: add per wq PRS disable dmaengine: idxd: add pid to exported sysfs attribute for opened file dmaengine: idxd: expose fault counters to sysfs dmaengine: idxd: add a device to represent the file opened dmaengine: idxd: add per file user counters for completion record faults dmaengine: idxd: process batch descriptor completion record faults dmaengine: idxd: add descs_completed field for completion record dmaengine: idxd: process user page faults for completion record dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling dmaengine: idxd: create kmem cache for event log fault items dmaengine: idxd: add per DSA wq workqueue for processing cr faults dmanegine: idxd: add debugfs for event log dump dmaengine: idxd: add interrupt handling for event log dmaengine: idxd: setup event log configuration dmaengine: idxd: add event log size sysfs attribute dmaengine: idxd: make misc interrupt one shot dt-bindings: dma: snps,dw-axi-dmac: constrain the items of resets for JH7110 dma dt-bindings: dma: Drop unneeded quotes dmaengine: at_xdmac: align declaration of ret with the rest of variables dmaengine: at_xdmac: add a warning message regarding for unpaused channels ...
2023-04-12	dmaengine: idxd: add per wq PRS disable	Dave Jiang
	Add sysfs knob for per wq Page Request Service disable. This knob disables PRS support for the specific wq. When this bit is set, it also overrides the wq's block on fault enabling. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-17-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: add a device to represent the file opened	Dave Jiang
	Embed a struct device for the user file context in order to export sysfs attributes related with the opened file. Tie the lifetime of the file context to the device. The sysfs entry will be added under the char device. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-14-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: add per file user counters for completion record faults	Dave Jiang
	Add counters per opened file for the char device in order to keep track how many completion record faults occurred and how many of those faults failed the writeback by the driver after attempt to fault in the page. The counters are managed by xarray that associates the PASID with struct idxd_user_context. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-13-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: process batch descriptor completion record faults	Dave Jiang
	Add event log processing for faulting of user batch descriptor completion record. When encountering an event log entry for a page fault on a completion record, the driver is expected to do the following: 1. If the "first error in batch" bit in event log entry error info is set, discard any previously recorded errors associated with the "batch identifier". 2. Fix the page fault according to the fault address in the event log. If successful, write the completion record to the fault address in user space. 3. If an error is encountered while writing the completion record and it is associated to a descriptor in the batch, the driver associates the error with the batch identifier of the event log entry and tracks it until the event log entry for the corresponding batch desc is encountered. While processing an event log entry for a batch descriptor with error indicating that one or more descs in the batch had event log entries, the driver will do the following before writing the batch completion record: 1. If the status field of the completion record is 0x1, the driver will change it to error code 0x5 (one or more operations in batch completed with status not successful) and changes the result field to 1. 2. If the status is error code 0x6 (page fault on batch descriptor list address), change the result field to 1. 3. If status is any other value, the completion record is not changed. 4. Clear the recorded error in preparation for next batch with same batch identifier. The result field is for user software to determine whether to set the "Batch Error" flag bit in the descriptor for continuation of partial batch descriptor completion. See DSA spec 2.0 for additional information. If no error has been recorded for the batch, the batch completion record is written to user space as is. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-12-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: process user page faults for completion record	Dave Jiang
	DSA supports page fault handling through PRS. However, the DMA engine that's processing the descriptor is blocked until the PRS response is received. Other workqueues sharing the engine are also blocked. Page fault handing by the driver with PRS disabled can be used to mitigate the stalling. With PRS disabled while ATS remain enabled, DSA handles page faults on a completion record by reporting an event in the event log. In this instance, the descriptor is completed and the event log contains the completion record address and the contents of the completion record. Add support to the event log handling code to fault in the completion record and copy the content of the completion record to user memory. A bitmap is introduced to keep track of discarded event log entries. When the user process initiates ->release() of the char device, it no longer is interested in any remaining event log entries tied to the relevant wq and PASID. The driver will mark the event log entry index in the bitmap. Upon encountering the entries during processing, the event log handler will just clear the bitmap bit and skip the entry rather than attempt to process the event log entry. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-10-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: add idxd_copy_cr() to copy user completion record during ↵	Fenghua Yu
	page fault handling Define idxd_copy_cr() to copy completion record to fault address in user address that is found by work queue (wq) and PASID. It will be used to write the user's completion record that the hardware device is not able to write due to user completion record page fault. An xarray is added to associate the PASID and mm with the struct idxd_user_context so mm can be found by PASID and wq. It is called when handling the completion record fault in a kernel thread context. Switch to the mm using kthread_use_vm() and copy the completion record to the mm via copy_to_user(). Once the copy is completed, switch back to the current mm using kthread_unuse_mm(). Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Tony Luck <tony.luck@intel.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-9-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: create kmem cache for event log fault items	Dave Jiang
	Add a kmem cache per device for allocating event log fault context. The context allows an event log entry to be copied and passed to a software workqueue to be processed. Due to each device can have different sized event log entry depending on device type, it's not possible to have a global kmem cache. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-8-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: add per DSA wq workqueue for processing cr faults	Dave Jiang
	Add a workqueue for user submitted completion record fault processing. The workqueue creation and destruction lifetime will be tied to the user sub-driver since it will only be used when the wq is a user type. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-7-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmanegine: idxd: add debugfs for event log dump	Dave Jiang
	Add debugfs entry to dump the content of the event log for debugging. The function will dump all non-zero entries in the event log. It will note which entries are processed and which entries are still pending processing at the time of the dump. The entries may not always be in chronological order due to the log is a circular buffer. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-6-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: setup event log configuration	Dave Jiang
	Add setup of event log feature for supported device. Event log addresses error reporting that was lacking in gen 1 DSA devices where a second error event does not get reported when a first event is pending software handling. The event log allows a circular buffer that the device can push error events to. It is up to the user to create a large enough event log ring in order to capture the expected events. The evl size can be set in the device sysfs attribute. By default 64 entries are supported as minimal when event log is enabled. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-4-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12	dmaengine: idxd: add event log size sysfs attribute	Dave Jiang
	Add support for changing of the event log size. Event log is a feature added to DSA 2.0 hardware to improve error reporting. It supersedes the SWERROR register on DSA 1.0 hardware and hope to prevent loss of reported errors. The error log size determines how many error entries supported for the device. It can be configured by the user via sysfs attribute. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-3-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-03-31	dmaengine: idxd: expose IAA CAP register via sysfs knob	Dave Jiang
	Add IAA (IAX) capability mask sysfs attribute to expose to applications. The mask provides application knowledge of what capabilities this IAA device supports. This mask is available for IAA 2.0 device or later. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230303213732.3357494-3-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-03-31	dmaengine: idxd: reformat swerror output to standard Linux bitmap output	Dave Jiang
	SWERROR register is 4 64bit wide registers. Currently the sysfs attribute just outputs 4 64bit hex integers. Convert to output with %*pb format specifier. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230303213732.3357494-2-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-03-31	iommu: Remove ioasid infrastructure	Jason Gunthorpe
	This has no use anymore, delete it all. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/20230322200803.869130-8-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-31	iommu/ioasid: Rename INVALID_IOASID	Jacob Pan
	INVALID_IOASID and IOMMU_PASID_INVALID are duplicated. Rename INVALID_IOASID and consolidate since we are moving away from IOASID infrastructure. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/20230322200803.869130-7-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-08	dmaengine: idxd: Fix max batch size for Intel IAA	Xiaochen Shen
	>From Intel IAA spec [1], Intel IAA does not support batch processing. Two batch related default values for IAA are incorrect in current code: (1) The max batch size of device is set during device initialization, that indicates batch is supported. It should be always 0 on IAA. (2) The max batch size of work queue is set to WQ_DEFAULT_MAX_BATCH (32) as the default value regardless of Intel DSA or IAA device during work queue setup and cleanup. It should be always 0 on IAA. Fix the issues by setting the max batch size of device and max batch size of work queue to 0 on IAA device, that means batch is not supported. [1]: https://cdrdv2.intel.com/v1/dl/getContent/721858 Fixes: 23084545dbb0 ("dmaengine: idxd: set max_xfer and max_batch for RO device") Fixes: 92452a72ebdf ("dmaengine: idxd: set defaults for wq configs") Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220930201528.18621-2-xiaochen.shen@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmaengine: idxd: add configuration for concurrent batch descriptor processing	Dave Jiang
	Add sysfs knob to allow control of the number of batch descriptors that can be concurrently processed by an engine in the group as a fraction of the Maximum Work Descriptors in Progress value specfied in ENGCAP register. This control knob is part of toggle for QoS control. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-6-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmaengine: idxd: add configuration for concurrent work descriptor processing	Dave Jiang
	Add sysfs knob to allow control of the number of work descriptors that can be concurrently processed by an engine in the group as a fraction of the Maximum Work Descriptors in Progress value specified in ENGCAP register. This control knob is part of toggle for QoS control. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-5-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmaengine: idxd: add WQ operation cap restriction support	Dave Jiang
	DSA 2.0 add the capability of configuring DMA ops on a per workqueue basis. This means that certain ops can be disabled by the system administrator for certain wq. By default, all ops are available. A bitmap is used to store the ops due to total op size of 256 bits and it is more convenient to use a range list to specify which bits are enabled. One of the usage to support this is for VM migration between different iteration of devices. The newer ops are disabled in order to allow guest to migrate to a host that only support older ops. Another usage is to restrict the WQ to certain operations for QoS of performance. A sysfs of ops_config attribute is added per wq. It is only usable when the ops_config bit is set under WQ_CAP register. This means that this attribute will return -EOPNOTSUPP on DSA 1.x devices. The expected input is a range list for the bits per operation the WQ supports. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-4-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmanegine: idxd: reformat opcap output to match bitmap_parse() input	Dave Jiang
	To make input and output consistent and prepping for the per WQ operation configuration support, change the output of opcap display to match the input that is expected by bitmap_parse() helper function. The output will be a bitmap with field width as the number of bits using the %*pb format specifier for printk() family. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-3-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmaengine: idxd: convert ats_dis to a wq flag	Dave Jiang
	Make wq attributes access consistent. Convert ats_dis to wq flag WQ_FLAG_ATS_DISABLE. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-2-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmaengine: idxd: track enabled workqueues in bitmap	Jerry Snitselaar
	Now that idxd_wq_disable_cleanup() sets the workqueue state to IDXD_WQ_DISABLED, use a bitmap to track which workqueues have been enabled. This will then be used to determine which workqueues should be re-enabled when attempting a software reset to recover from a device halt state. Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20220928154856.623545-3-jsnitsel@redhat.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-05-16	dmaengine: idxd: make idxd_register/unregister_dma_channel() static	Dave Jiang
	Since idxd_register/unregister_dma_channel() are only called locally, make them static. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165187583222.3287435.12882651040433040246.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-05-16	dmaengine: idxd: Separate user and kernel pasid enabling	Dave Jiang
	The idxd driver always gated the pasid enabling under a single knob and this assumption is incorrect. The pasid used for kernel operation can be independently toggled and has no dependency on the user pasid (and vice versa). Split the two so they are independent "enabled" flags. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165231431746.986466.5666862038354800551.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-04-22	dmaengine: idxd: refactor wq driver enable/disable operations	Dave Jiang
	Move the core driver operations from wq driver to the drv_enable_wq() and drv_disable_wq() functions. The move should reduce the wq driver's knowledge of the core driver operations and prevent code confusion for future wq drivers. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165047301643.3841827.11222723219862233060.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-01-05	dmaengine: idxd: change bandwidth token to read buffers	Dave Jiang
	DSA spec v1.2 has changed the term of "bandwidth tokens" to "read buffers" in order to make the concept clearer. Deprecate bandwidth token naming in the driver and convert to read buffers in order to match with the spec and reduce confusion when reading the spec. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163951338932.2988321.6162640806935567317.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-01-05	dmaengine: idxd: change MSIX allocation based on per wq activation	Dave Jiang
	Change the driver where WQ interrupt is requested only when wq is being enabled. This new scheme set things up so that request_threaded_irq() is only called when a kernel wq type is being enabled. This also sets up for future interrupt request where different interrupt handler such as wq occupancy interrupt can be setup instead of the wq completion interrupt. Not calling request_irq() until the WQ actually needs an irq also prevents wasting of CPU irq vectors on x86 systems, which is a limited resource. idxd_flush_pending_descs() is moved to device.c since descriptor flushing is now part of wq disable rather than shutdown(). Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163942149487.2412839.6691222855803875848.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-01-05	dmaengine: idxd: embed irq_entry in idxd_wq struct	Dave Jiang
	With irq_entry already being associated with the wq in a 1:1 relationship, embed the irq_entry in the idxd_wq struct and remove back pointers for idxe_wq and idxd_device. In the process of this work, clean up the interrupt handle assignment so that there's no decision to be made during submit call on where interrupt handle value comes from. Set the interrupt handle during irq request initialization time. irq_entry 0 is designated as special and is tied to the device itself. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163942148362.2412839.12055447853311267866.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-12-17	dmaengine: idxd: add knob for enqcmds retries	Dave Jiang
	Add a sysfs knob to allow tuning of retries for the kernel ENQCMDS descriptor submission. While on host, it is not as likely that ENQCMDS return busy during normal operations due to the driver controlling the number of descriptors allocated for submission. However, when the driver is operating as a guest driver, the chance of retry goes up significantly due to sharing a wq with multiple VMs. A default value is provided with the system admin being able to tune the value on a per WQ basis. Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163820629464.2702134.7577370098568297574.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-12-17	dmaengine: idxd: set defaults for wq configs	Dave Jiang
	Add default values for wq size, max_xfer_size and max_batch_size. These values should provide a general guidance for the wq configuration when the user does not specify any specific values. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528473483.3926048.7950067926287180976.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22	dmaengine: idxd: handle interrupt handle revoked event	Dave Jiang
	"Interrupt handle revoked" is an event that happens when the driver is running on a guest kernel and the VM is migrated to a new machine. The device will trigger an interrupt that signals to the guest driver that the interrupt handles need to be replaced. The misc irq thread function calls a helper function to handle the event. The function uses the WQ percpu_ref to quiesce the kernel submissions. It then replaces the interrupt handles by requesting interrupt handle command for each I/O MSIX vector. Once the handle is updated, the driver will unblock the submission path to allow new submissions. The submitter will attempt to acquire a percpu_ref before submission. When the request fails, it will wait on the wq_resurrect 'completion'. The driver does anticipate the possibility of descriptors being submitted before the WQ percpu_ref is killed. If a descriptor has already been submitted, it will return with incorrect interrupt handle status. The descriptor will be re-submitted with the new interrupt handle on the completion path. For descriptors with incorrect interrupt handles, completion interrupt won't be triggered. At the completion of the interrupt handle refresh, the handling function will call idxd_int_handle_refresh_drain() to issue drain descriptors to each of the wq with associated interrupt handle. The drain descriptor will have interrupt request set but without completion record. This will ensure all descriptors with incorrect interrupt completion handle get drained and a completion interrupt is triggered for the guest driver to process them. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Co-Developed-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528420189.3925689.18212568593220415551.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22	dmaengine: idxd: handle invalid interrupt handle descriptors	Dave Jiang
	Handle a descriptor that has been marked with invalid interrupt handle error in status. Create a work item that will resubmit the descriptor. This typically happens when the driver has handled the revoke interrupt handle event and has a new interrupt handle. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528419601.3925689.4166517602890523193.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22	dmaengine: idxd: create locked version of idxd_quiesce() call	Dave Jiang
	Add a locked version of idxd_quiesce() call so that the quiesce can be called with a lock in situations where the lock is not held by the caller. In the driver probe/remove path, the lock is already held, so the raw version can be called w/o locking. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528418980.3925689.5841907054957931211.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>