lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2021-10-07	qcom_scm: hide Kconfig symbol	Arnd Bergmann
	Now that SCM can be a loadable module, we have to add another dependency to avoid link failures when ipa or adreno-gpu are built-in:
	  aarch64-linux-ld: drivers/net/ipa/ipa_main.o: in function `ipa_probe':
	  ipa_main.c:(.text+0xfc4): undefined reference to `qcom_scm_is_available'
	  ld.lld: error: undefined symbol: qcom_scm_is_available
	  >>> referenced by adreno_gpu.c
	  >>> gpu/drm/msm/adreno/adreno_gpu.o:(adreno_zap_shader_load) in archive drivers/built-in.a
	This can happen when CONFIG_ARCH_QCOM is disabled and we don't select QCOM_MDT_LOADER, but some other module selects QCOM_SCM. Ideally we'd use a similar dependency here to what we have for QCOM_RPROC_COMMON, but that causes dependency loops from other things selecting QCOM_SCM.
	This appears to be an endless problem, so try something different this time:
	- CONFIG_QCOM_SCM becomes a hidden symbol that nothing 'depends on' but that is simply selected by all of its users
	- All the stubs in include/linux/qcom_scm.h can go away
	- arm-smccc.h needs to provide a stub for __arm_smccc_smc() to allow compile-testing QCOM_SCM on all architectures.
	- To avoid a circular dependency chain involving RESET_CONTROLLER and PINCTRL_SUNXI, drop the 'select RESET_CONTROLLER' statement. According to my testing this still builds fine, and the QCOM platform selects this symbol already.
	Acked-by: Kalle Valo <kvalo@codeaurora.org> Acked-by: Alex Elder <elder@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-08-20	Merge branches 'apple/dart', 'arm/smmu', 'iommu/fixes', 'x86/amd', 'x86/vt-d' and 'core' into next	Joerg Roedel
2021-08-20	iommu/arm-smmu: Fix missing unlock on error in arm_smmu_device_group()	Yang Yingliang
	Add the missing unlock before return from function arm_smmu_device_group() in the error handling case. Fixes: b1a1347912a7 ("iommu/arm-smmu: Fix race condition during iommu_group creation") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210820074949.1946576-1-yangyingliang@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
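	The shape of this fix can be sketched in user space. This is a minimal illustration, not the driver code: `toy_lock`, `device_group()`, `groups_lock_held()` and the ID table are hypothetical stand-ins, and the lock is a single-threaded toy that only records whether it is held. The point is that every exit path, including the error path, goes through the unlock.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_IDS 4

/* Toy lock (hypothetical): records whether it is held so imbalance is visible. */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

struct group { int refs; };

static struct toy_lock groups_lock;
static struct group group_pool[MAX_IDS];
static struct group *groups[MAX_IDS];

/*
 * Hypothetical stand-in for a group lookup that allocates on first use.
 * Returns NULL and sets *err on failure.
 */
struct group *device_group(int id, int *err)
{
	struct group *g = NULL;

	assert(err != NULL);
	*err = 0;
	toy_lock_acquire(&groups_lock);

	if (id < 0 || id >= MAX_IDS) {
		*err = -EINVAL;
		/* The buggy shape returns here directly and leaves the lock held. */
		goto out_unlock;
	}

	if (!groups[id])
		groups[id] = &group_pool[id];
	g = groups[id];
	g->refs++;

out_unlock:
	toy_lock_release(&groups_lock);
	return g;
}

/* Lets a caller check that the lock was dropped. */
int groups_lock_held(void)
{
	return groups_lock.held;
}
```

	Calling `device_group()` with an out-of-range ID returns NULL with `-EINVAL`, and `groups_lock_held()` reports 0 afterwards, which is the property the missing unlock violated.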
2021-08-18	iommu/arm-smmu: Prepare for multiple DMA domain types	Robin Murphy
	In preparation for the strict vs. non-strict decision for DMA domains to be expressed in the domain type, make sure we expose our flush queue awareness by accepting the new domain type. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/8f217ef285bd0bb9456c27ef622d2efdbbca1ad8.1628682049.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18	iommu/io-pgtable: Remove non-strict quirk	Robin Murphy
	IO_PGTABLE_QUIRK_NON_STRICT was never a very comfortable fit, since it's not a quirk of the pagetable format itself. Now that we have a more appropriate way to convey non-strict unmaps, though, this last of the non-quirk quirks can also go, and with the flush queue code also now enforcing its own ordering we can have a lovely cleanup all round. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/155b5c621cd8936472e273a8b07a182f62c6c20d.1628682049.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18	iommu/arm-smmu: Drop IOVA cookie management	Robin Murphy
	The core code bakes its own cookies now. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/7ae3680dad9735cc69c3618866666896bd11e031.1628682048.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-13	iommu/arm-smmu-v3: Stop pre-zeroing batch commands	John Garry
	Pre-zeroing the batched commands structure is inefficient, as individual commands are zeroed later in arm_smmu_cmdq_build_cmd(). The size is quite large and commonly most commands won't even be used:
	  struct arm_smmu_cmdq_batch cmds = {};
	  345c: 52800001 mov w1, #0x0 // #0
	  3460: d2808102 mov x2, #0x408 // #1032
	  3464: 910143a0 add x0, x29, #0x50
	  3468: 94000000 bl 0 <memset>
	Stop pre-zeroing the complete structure and only zero the num member.
	Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1628696966-88386-1-git-send-email-john.garry@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
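	A rough user-space sketch of the idea, assuming a batch layout chosen so that it is about the size seen in the disassembly (0x408 bytes on a typical 64-bit ABI). The names `cmdq_batch`, `batch_init()` and `batch_add()` and the two constants are assumptions for illustration, not the driver's definitions. Each slot is zeroed when a command is built into it, so only the count needs an initial value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes for illustration: 64 entries of two 64-bit words each. */
#define CMDQ_ENT_DWORDS    2
#define CMDQ_BATCH_ENTRIES 64

struct cmdq_batch {
	uint64_t cmds[CMDQ_BATCH_ENTRIES * CMDQ_ENT_DWORDS];
	int num;
};

/* Only the count is initialised; the command slots are left untouched. */
void batch_init(struct cmdq_batch *b)
{
	b->num = 0;
}

/*
 * Build one command into the next free slot. The slot is zeroed here,
 * which is why zeroing the whole array up front would be redundant.
 */
void batch_add(struct cmdq_batch *b, uint64_t opcode)
{
	uint64_t *slot;

	assert(b->num < CMDQ_BATCH_ENTRIES);
	slot = &b->cmds[b->num * CMDQ_ENT_DWORDS];
	memset(slot, 0, CMDQ_ENT_DWORDS * sizeof(*slot));
	slot[0] = opcode;
	b->num++;
}
```

	Even if the array starts out holding garbage, every slot that is actually used has been cleared by `batch_add()` before it is read.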
2021-08-13	iommu/arm-smmu-v3: Extract reusable function __arm_smmu_cmdq_skip_err()	Zhen Lei
	When SMMU_GERROR.CMDQP_ERR is different to SMMU_GERRORN.CMDQP_ERR, it indicates that one or more errors have been encountered on a command queue control page interface. We need to traverse all ECMDQs in that control page to find all errors. For each ECMDQ error handling, it is much the same as the CMDQ error handling. This common processing part is extracted as a new function __arm_smmu_cmdq_skip_err(). Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210811114852.2429-5-thunder.leizhen@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-08-13	iommu/arm-smmu-v3: Add and use static helper function arm_smmu_get_cmdq()	Zhen Lei
	One SMMU has only one normal CMDQ. Therefore, this CMDQ is used regardless of the core on which the command is inserted. It can be referenced directly through "smmu->cmdq". However, one SMMU has multiple ECMDQs, and the ECMDQ used by the core on which the command insertion is executed may be different. So the helper function arm_smmu_get_cmdq() is added, which returns the CMDQ/ECMDQ that the current core should use. The code that supports ECMDQ has not been added yet, so for now the helper simply returns "&smmu->cmdq". Many subfunctions of arm_smmu_cmdq_issue_cmdlist() use "&smmu->cmdq" or "&smmu->cmdq.q" directly. To support ECMDQ, they need to call the newly added function arm_smmu_get_cmdq() instead. Note that normal CMDQ is still required until ECMDQ is available. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210811114852.2429-4-thunder.leizhen@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-08-13	iommu/arm-smmu-v3: Add and use static helper function arm_smmu_cmdq_issue_cmd_with_sync()	Zhen Lei
	The obvious key to the performance optimization of commit 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion") is to allow multiple cores to insert commands in parallel after a brief mutex contention. Obviously, inserting as many commands at a time as possible can reduce the number of times the mutex contention participates, thereby improving the overall performance. At least it reduces the number of calls to function arm_smmu_cmdq_issue_cmdlist(). Therefore, function arm_smmu_cmdq_issue_cmd_with_sync() is added to insert the 'cmd+sync' commands at a time. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210811114852.2429-3-thunder.leizhen@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-08-13	iommu/arm-smmu-v3: Use command queue batching helpers to improve performance	Zhen Lei
	The obvious key to the performance optimization of commit 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion") is to allow multiple cores to insert commands in parallel after a brief mutex contention. Obviously, inserting as many commands at a time as possible can reduce the number of times the mutex contention participates, thereby improving the overall performance. At least it reduces the number of calls to function arm_smmu_cmdq_issue_cmdlist(). Therefore, use command queue batching helpers to insert multiple commands at a time. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210811114852.2429-2-thunder.leizhen@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-08-13	iommu/arm-smmu: Optimize ->tlb_flush_walk() for qcom implementation	Sai Prakash Ranjan
	Currently for iommu_unmap() of large scatter-gather list with page size elements, the majority of time is spent in flushing of partial walks in __arm_lpae_unmap() which is a VA based TLB invalidation invalidating page-by-page on iommus like arm-smmu-v2 (TLBIVA).
	For example: to unmap a 32MB scatter-gather list with page size elements (8192 entries), there are 16->2MB buffer unmaps based on the pgsize (2MB for 4K granule) and each of 2MB will further result in 512 TLBIVAs (2MB/4K) resulting in a total of 8192 TLBIVAs (512*16) for 16->2MB causing a huge overhead.
	On qcom implementation, there are several performance improvements for TLB cache invalidations in HW like wait-for-safe (for realtime clients such as camera and display) and few others to allow for cache lookups/updates when TLBI is in progress for the same context bank. So the cost of over-invalidation is less compared to the unmap latency on several usecases like camera which deals with large buffers. So, ASID based TLB invalidations (TLBIASID) can be used to invalidate the entire context for partial walk flush thereby improving the unmap latency.
	For this example of 32MB scatter-gather list unmap, this change results in just 16 ASID based TLB invalidations (TLBIASIDs) as opposed to 8192 TLBIVAs thereby increasing the performance of unmaps drastically.
	Test on QTI SM8150 SoC for 10 iterations of iommu_{map_sg}/unmap: (average over 10 iterations)
	Before this optimization:
	  size    iommu_map_sg    iommu_unmap
	  4K         2.067 us       1.854 us
	  64K        9.598 us       8.802 us
	  1M       148.890 us     130.718 us
	  2M       305.864 us      67.291 us
	  12M     1793.604 us     390.838 us
	  16M     2386.848 us     518.187 us
	  24M     3563.296 us     775.989 us
	  32M     4747.171 us    1033.364 us
	After this optimization:
	  size    iommu_map_sg    iommu_unmap
	  4K         1.723 us       1.765 us
	  64K        9.880 us       8.869 us
	  1M       155.364 us     135.223 us
	  2M       303.906 us       5.385 us
	  12M     1786.557 us      21.250 us
	  16M     2391.890 us      27.437 us
	  24M     3570.895 us      39.937 us
	  32M     4755.234 us      51.797 us
	Real world data also shows big difference in unmap performance as below: There were reports of camera frame drops because of high overhead in iommu unmap without this optimization because of frequent unmaps issued by camera of about 100MB/s taking more than 100ms thereby causing frame drops.
	Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Link: https://lore.kernel.org/r/20210811160426.10312-1-saiprakash.ranjan@codeaurora.org Signed-off-by: Will Deacon <will@kernel.org>
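	The invalidation counts quoted in this commit message can be reproduced with a few lines of arithmetic. This is only a worked calculation, with hypothetical helper names `tlbiva_count()` and `tlbiasid_count()`; it does not model the hardware or the driver.

```c
#include <assert.h>

#define SZ_4K (4UL * 1024)
#define SZ_2M (2UL * 1024 * 1024)

/*
 * Per-page invalidation: each block-sized unmap issues one TLBIVA per
 * granule-sized page it covers.
 */
unsigned long tlbiva_count(unsigned long total, unsigned long block,
			   unsigned long granule)
{
	assert(granule != 0 && block % granule == 0);
	assert(block != 0 && total % block == 0);
	return (total / block) * (block / granule);
}

/* Per-context invalidation: one TLBIASID per block-sized unmap. */
unsigned long tlbiasid_count(unsigned long total, unsigned long block)
{
	assert(block != 0 && total % block == 0);
	return total / block;
}
```

	For a 32MB list with 2MB blocks and a 4K granule, `tlbiva_count()` gives 16 * 512 = 8192 and `tlbiasid_count()` gives 16, matching the figures in the text.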
2021-08-10	iommu/arm-smmu: Fix race condition during iommu_group creation	Krishna Reddy
	When two devices with same SID are getting probed concurrently through iommu_probe_device(), the iommu_group sometimes is getting allocated more than once as call to arm_smmu_device_group() is not protected for concurrency. Furthermore, it leads to each device holding a different iommu_group and domain pointer, separate IOVA space and only one of the devices' domain is used for translations from IOMMU. This causes accesses from other device to fault or see incorrect translations. Fix this by protecting iommu_group allocation from concurrency in arm_smmu_device_group(). Signed-off-by: Krishna Reddy <vdumpa@nvidia.com> Signed-off-by: Ashish Mhetre <amhetre@nvidia.com> Link: https://lore.kernel.org/r/1628570641-9127-3-git-send-email-amhetre@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
2021-08-10	iommu/arm-smmu: Add clk_bulk_{prepare/unprepare} to system pm callbacks	Sai Prakash Ranjan
	Some SMMU clocks can have XO as their parent, such as gpu_cc_hub_cx_int_clk of the GPU SMMU in the QTI SC7280 SoC. To enter deep sleep states in such cases, we need to drop the XO clock vote in the unprepare call; the unprepare callback for XO lives in the RPMh (Resource Power Manager-Hardened) clock driver, which controls RPMh managed clock resources for new QTI SoCs. We cannot make sleeping calls such as clk_bulk_prepare() and clk_bulk_unprepare() in the arm-smmu runtime pm callbacks, since iommu operations like map and unmap can run in atomic context and are in the fast path. So add the prepare and unprepare calls, which drop the XO vote, only to the system pm callbacks: they are not a fast path, and we expect the system to enter deep sleep states with system pm as opposed to runtime pm. This is a similar sequence of clock requests (prepare, enable and disable, unprepare) to the one in arm-smmu probe and remove. Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Co-developed-by: Rajendra Nayak <rnayak@codeaurora.org> Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org> Link: https://lore.kernel.org/r/20210810064808.32486-1-saiprakash.ranjan@codeaurora.org Signed-off-by: Will Deacon <will@kernel.org>
2021-08-02	iommu/arm-smmu-v3: Implement the map_pages() IOMMU driver callback	Xiang Chen
	Implement the map_pages() callback for ARM SMMUV3 driver to allow calls from iommu_map to map multiple pages of the same size in one call. Also remove the map() callback for the ARM SMMUV3 driver as it will no longer be used. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/1627697831-158822-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
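	To illustrate what "multiple pages of the same size in one call" means, here is a toy user-space sketch. The `toy_pgtable` structure and `toy_map_pages()` are hypothetical and only loosely follow the shape of a (page size, page count) interface; the real callback takes further arguments and walks real page tables. The sketch assumes a single 4K page size and a flat table of 64 slots.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_NR_SLOTS  64

/* Toy flat "page table": one physical address per 4K IOVA slot. */
struct toy_pgtable {
	unsigned long phys[TOY_NR_SLOTS]; /* 0 means "not mapped" in this toy */
	unsigned int calls;               /* how often the driver was entered */
};

/*
 * Map pgcount pages of pgsize bytes starting at iova/paddr in one call.
 * Reports the number of bytes mapped through *mapped.
 */
int toy_map_pages(struct toy_pgtable *pt, unsigned long iova,
		  unsigned long paddr, size_t pgsize, size_t pgcount,
		  size_t *mapped)
{
	unsigned long first = iova / TOY_PAGE_SIZE;
	size_t i;

	assert(pt != NULL && mapped != NULL);
	*mapped = 0;
	pt->calls++;

	if (pgsize != TOY_PAGE_SIZE || iova % pgsize || paddr % pgsize)
		return -1;
	if (first >= TOY_NR_SLOTS || pgcount > TOY_NR_SLOTS - first)
		return -1;

	for (i = 0; i < pgcount; i++) {
		pt->phys[first + i] = paddr + i * pgsize;
		*mapped += pgsize;
	}
	return 0;
}
```

	Mapping eight 4K pages this way enters the toy driver once rather than eight times, which is the saving a per-page map() callback cannot offer.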
2021-08-02	iommu/arm-smmu-v3: Implement the unmap_pages() IOMMU driver callback	Xiang Chen
	Implement the unmap_pages() callback for ARM SMMUV3 driver to allow calls from iommu_unmap to unmap multiple pages of the same size in one call. Also remove the unmap() callback for the ARM SMMUV3 driver as it will no longer be used. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/1627697831-158822-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-02	iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()	John Garry
	Members of struct "llq" will be zero-inited, apart from member max_n_shift. But we write llq.val straight after the init, so it was pointless to zero init those other members. As such, separately init member max_n_shift only. In addition, struct "head" is initialised to "llq" only so that member max_n_shift is set. But that member is never referenced for "head", so remove any init there. Removing these initializations is seen as a small performance optimisation, as this code is (very) hot path. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1624293394-202509-1-git-send-email-john.garry@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-07-26	iommu: Streamline iommu_iova_to_phys()	Robin Murphy
	If people are going to insist on calling iommu_iova_to_phys() pointlessly and expecting it to work, we can at least do ourselves a favour by handling those cases in the core code, rather than repeatedly across an inconsistent handful of drivers. Since all the existing drivers implement the internal callback, and any future ones are likely to want to work with iommu-dma which relies on iova_to_phys a fair bit, we may as well remove that currently-redundant check as well and consider it mandatory. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/f564f3f6ff731b898ff7a898919bf871c2c7745a.1626354264.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26	iommu/arm-smmu: Implement the map_pages() IOMMU driver callback	Isaac J. Manjarres
	Implement the map_pages() callback for the ARM SMMU driver to allow calls from iommu_map to map multiple pages of the same size in one call. Also, remove the map() callback for the ARM SMMU driver, as it will no longer be used. Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/1623850736-389584-16-git-send-email-quic_c_gdjako@quicinc.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26	iommu/arm-smmu: Implement the unmap_pages() IOMMU driver callback	Isaac J. Manjarres
	Implement the unmap_pages() callback for the ARM SMMU driver to allow calls from iommu_unmap to unmap multiple pages of the same size in one call. Also, remove the unmap() callback for the SMMU driver, as it will no longer be used. Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/1623850736-389584-15-git-send-email-quic_c_gdjako@quicinc.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-15	Merge tag 'Wimplicit-fallthrough-clang-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux	Linus Torvalds
	Pull fallthrough fixes from Gustavo Silva:
	"This fixes many fall-through warnings when building with Clang and -Wimplicit-fallthrough, and also enables -Wimplicit-fallthrough for Clang, globally.
	It's also important to notice that since we have adopted the use of the pseudo-keyword macro fallthrough, we also want to avoid having more /* fall through */ comments being introduced. Contrary to GCC, Clang doesn't recognize any comments as implicit fall-through markings when the -Wimplicit-fallthrough option is enabled. So, in order to avoid having more comments being introduced, we use the option -Wimplicit-fallthrough=5 for GCC, which similar to Clang, will cause a warning in case a code comment is intended to be used as a fall-through marking. The patch for Makefile also enforces this.
	We had almost 4,000 of these issues for Clang in the beginning, and there might be a couple more out there when building some architectures with certain configurations. However, with the recent fixes I think we are in good shape and it is now possible to enable the warning for Clang"
	* tag 'Wimplicit-fallthrough-clang-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits)
	  Makefile: Enable -Wimplicit-fallthrough for Clang
	  powerpc/smp: Fix fall-through warning for Clang
	  dmaengine: mpc512x: Fix fall-through warning for Clang
	  usb: gadget: fsl_qe_udc: Fix fall-through warning for Clang
	  powerpc/powernv: Fix fall-through warning for Clang
	  MIPS: Fix unreachable code issue
	  MIPS: Fix fall-through warnings for Clang
	  ASoC: Mediatek: MT8183: Fix fall-through warning for Clang
	  power: supply: Fix fall-through warnings for Clang
	  dmaengine: ti: k3-udma: Fix fall-through warning for Clang
	  s390: Fix fall-through warnings for Clang
	  dmaengine: ipu: Fix fall-through warning for Clang
	  iommu/arm-smmu-v3: Fix fall-through warning for Clang
	  mmc: jz4740: Fix fall-through warning for Clang
	  PCI: Fix fall-through warning for Clang
	  scsi: libsas: Fix fall-through warning for Clang
	  video: fbdev: Fix fall-through warning for Clang
	  math-emu: Fix fall-through warning
	  cpufreq: Fix fall-through warning for Clang
	  drm/msm: Fix fall-through warning in msm_gem_new_impl()
	  ...
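	A small sketch of an annotated fall-through, for readers unfamiliar with the pseudo-keyword. The macro definition below is a user-space approximation written for this example (it falls back to an empty statement on compilers without the attribute), and `cleanup_steps()` is a hypothetical function; neither is taken from the merged patches.

```c
#include <assert.h>

/* Approximation of a fallthrough pseudo-keyword for user-space builds. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/*
 * Hypothetical staged teardown: entering at a later stage also runs all
 * earlier stages, so each case deliberately falls into the next one.
 */
int cleanup_steps(int stage)
{
	int steps = 0;

	switch (stage) {
	case 2:
		steps++;
		fallthrough;
	case 1:
		steps++;
		fallthrough;
	case 0:
		steps++;
		break;
	default:
		assert(0 && "unknown stage");
	}
	return steps;
}
```

	With the statement in place, a compiler that warns on implicit fall-through stays quiet for these cases, whereas a bare comment would not satisfy Clang.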
2021-07-14	iommu/qcom: Revert "iommu/arm: Cleanup resources in case of probe error path"	Marek Szyprowski
	The QCOM IOMMU driver calls bus_set_iommu() for every IOMMU device controller, which fails for the second and later IOMMU devices. This is intended and must not be fatal to the driver registration process. Also, the cleanup path should take care of the runtime PM state, which is missing in the current patch. Revert relevant changes to the QCOM IOMMU driver until a proper fix is prepared. This partially reverts commit 249c9dc6aa0db74a0f7908efd04acf774e19b155. Fixes: 249c9dc6aa0d ("iommu/arm: Cleanup resources in case of probe error path") Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210705065657.30356-1-m.szyprowski@samsung.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-13	iommu/arm-smmu-v3: Fix fall-through warning for Clang	Gustavo A. R. Silva
	Fix the following fallthrough warning (arm64-randconfig with Clang): drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:382:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough] Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%25lkp@intel.com/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-07-10	Merge tag 'arm-drivers-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc	Linus Torvalds
	Pull ARM driver updates from Olof Johansson:
	- Reset controllers: Adding support for Microchip Sparx5 Switch.
	- Memory controllers: ARM Primecell PL35x SMC memory controller driver cleanups and improvements.
	- i.MX SoC drivers: Power domain support for i.MX8MM and i.MX8MN.
	- Rockchip: RK3568 power domains support + DT binding updates, cleanups.
	- Qualcomm SoC drivers: Amend socinfo with more SoC/PMIC details, including support for MSM8226, MDM9607, SM6125 and SC8180X.
	- ARM FFA driver: "Firmware Framework for ARMv8-A", defining management interfaces and communication (including bus model) between partitions both in Normal and Secure Worlds.
	- Tegra Memory controller changes, including major rework to deal with identity mappings at boot and integration with ARM SMMU pieces.
	* tag 'arm-drivers-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (120 commits)
	  firmware: turris-mox-rwtm: add marvell,armada-3700-rwtm-firmware compatible string
	  firmware: turris-mox-rwtm: show message about HWRNG registration
	  firmware: turris-mox-rwtm: fail probing when firmware does not support hwrng
	  firmware: turris-mox-rwtm: report failures better
	  firmware: turris-mox-rwtm: fix reply status decoding function
	  soc: imx: gpcv2: add support for i.MX8MN power domains
	  dt-bindings: add defines for i.MX8MN power domains
	  firmware: tegra: bpmp: Fix Tegra234-only builds
	  iommu/arm-smmu: Use Tegra implementation on Tegra186
	  iommu/arm-smmu: tegra: Implement SID override programming
	  iommu/arm-smmu: tegra: Detect number of instances at runtime
	  dt-bindings: arm-smmu: Add Tegra186 compatible string
	  firmware: qcom_scm: Add MDM9607 compatible
	  soc: qcom: rpmpd: Add MDM9607 RPM Power Domains
	  soc: renesas: Add support to read LSI DEVID register of RZ/G2{L,LC} SoC's
	  soc: renesas: Add ARCH_R9A07G044 for the new RZ/G2L SoC's
	  dt-bindings: soc: rockchip: drop unnecessary #phy-cells from grf.yaml
	  memory: emif: remove unused frequency and voltage notifiers
	  memory: fsl_ifc: fix leak of private memory on probe failure
	  memory: fsl_ifc: fix leak of IO mapping on probe failure
	  ...
2021-07-02	Merge tag 'iommu-updates-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu	Linus Torvalds
	Pull iommu updates from Joerg Roedel:
	- SMMU Updates from Will Deacon:
	  - SMMUv3:
	    - Support stalling faults for platform devices
	    - Decrease defaults sizes for the event and PRI queues
	  - SMMUv2:
	    - Support for a new '->probe_finalize' hook, needed by Nvidia
	    - Even more Qualcomm compatible strings
	    - Avoid Adreno TTBR1 quirk for DB820C platform
	- Intel VT-d updates from Lu Baolu:
	  - Convert Intel IOMMU to use sva_lib helpers in iommu core
	  - ftrace and debugfs supports for page fault handling
	  - Support asynchronous nested capabilities
	  - Various misc cleanups
	- Support for new VIOT ACPI table to make the VirtIO IOMMU available on x86
	- Add the amd_iommu=force_enable command line option to enable the IOMMU on platforms where they are known to cause problems
	- Support for version 2 of the Rockchip IOMMU
	- Various smaller fixes, cleanups and refactorings
	* tag 'iommu-updates-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits)
	  iommu/virtio: Enable x86 support
	  iommu/dma: Pass address limit rather than size to iommu_setup_dma_ops()
	  ACPI: Add driver for the VIOT table
	  ACPI: Move IOMMU setup code out of IORT
	  ACPI: arm64: Move DMA setup operations out of IORT
	  iommu/vt-d: Fix dereference of pointer info before it is null checked
	  iommu: Update "iommu.strict" documentation
	  iommu/arm-smmu: Check smmu->impl pointer before dereferencing
	  iommu/arm-smmu-v3: Remove unnecessary oom message
	  iommu/arm-smmu: Fix arm_smmu_device refcount leak in address translation
	  iommu/arm-smmu: Fix arm_smmu_device refcount leak when arm_smmu_rpm_get fails
	  iommu/vt-d: Fix linker error on 32-bit
	  iommu/vt-d: No need to typecast
	  iommu/vt-d: Define counter explicitly as unsigned int
	  iommu/vt-d: Remove unnecessary braces
	  iommu/vt-d: Removed unused iommu_count in dmar domain
	  iommu/vt-d: Use bitfields for DMAR capabilities
	  iommu/vt-d: Use DEVICE_ATTR_RO macro
	  iommu/vt-d: Fix out-bounds-warning in intel/svm.c
	  iommu/vt-d: Add PRQ handling latency sampling
	  ...
2021-06-25	Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/smmu', 'x86/vt-d', 'x86/amd', 'virtio' and 'core' into next	Joerg Roedel
2021-06-23	iommu/arm-smmu-qcom: Add stall support	Rob Clark
	Add, via the adreno-smmu-priv interface, a way for the GPU to request the SMMU to stall translation on faults, and then later resume the translation, either retrying or terminating the current translation. This will be used on the GPU side to "freeze" the GPU while we snapshot useful state for devcoredump. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Jordan Crouse <jordan@cosmicpenguin.net> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210610214431.539029-5-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-06-23	iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault info	Jordan Crouse
	Add a callback in adreno-smmu-priv to read interesting SMMU registers to provide an opportunity for a richer debug experience in the GPU driver. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210610214431.539029-3-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-06-23	iommu/arm-smmu: Add support for driver IOMMU fault handlers	Jordan Crouse
	Call report_iommu_fault() to allow upper-level drivers to register their own fault handlers. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210610214431.539029-2-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-06-16	Merge branch 'for-thierry/arm-smmu' into for-joerg/arm-smmu/updates	Will Deacon
	Merge in support for the Arm SMMU '->probe_finalize()' implementation callback, which is required to prevent early faults in conjunction with Nvidia's memory controller.
	* for-thierry/arm-smmu:
	  iommu/arm-smmu: Check smmu->impl pointer before dereferencing
	  iommu/arm-smmu: Implement ->probe_finalize()
2021-06-15	iommu/arm-smmu: Check smmu->impl pointer before dereferencing	Will Deacon
	Commit 0d97174aeadf ("iommu/arm-smmu: Implement ->probe_finalize()") added a new optional ->probe_finalize callback to 'struct arm_smmu_impl' but neglected to check that 'smmu->impl' is present prior to checking if the new callback is present. Add the missing check, which avoids dereferencing NULL when probing an SMMU which doesn't require any implementation-specific callbacks:
	  | Unable to handle kernel NULL pointer dereference at virtual address
	  | 0000000000000070
	  |
	  | Call trace:
	  |  arm_smmu_probe_finalize+0x14/0x48
	  |  of_iommu_configure+0xe4/0x1b8
	  |  of_dma_configure_id+0xf8/0x2d8
	  |  pci_dma_configure+0x44/0x88
	  |  really_probe+0xc0/0x3c0
	Fixes: 0d97174aeadf ("iommu/arm-smmu: Implement ->probe_finalize()") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Will Deacon <will@kernel.org>
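	The check being added has a simple shape: test the optional ops pointer before testing the optional hook inside it. The sketch below uses hypothetical types (`toy_impl`, `toy_smmu`) and a counter argument so that the behaviour can be observed; it is not the driver's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical optional per-implementation hooks. */
struct toy_impl {
	void (*probe_finalize)(int *calls);
};

struct toy_smmu {
	const struct toy_impl *impl; /* NULL when no impl-specific hooks exist */
};

/* Example hook used to observe that the callback ran. */
void count_call(int *calls)
{
	(*calls)++;
}

/*
 * Returns 1 if the hook was called, 0 if there was nothing to call.
 * Testing smmu->impl first is what avoids the NULL dereference.
 */
int toy_probe_finalize(const struct toy_smmu *smmu, int *calls)
{
	assert(smmu != NULL && calls != NULL);

	if (smmu->impl && smmu->impl->probe_finalize) {
		smmu->impl->probe_finalize(calls);
		return 1;
	}
	return 0;
}
```

	A device with no `impl` at all and a device whose `impl` leaves the hook unset both return 0 without touching memory they should not.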
2021-06-15	iommu/arm-smmu-v3: Remove unnecessary oom message	Zhen Lei
	Fixes scripts/checkpatch.pl warning: WARNING: Possible unnecessary 'out of memory' message. Removing it saves a bit of memory. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210609125438.14369-1-thunder.leizhen@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-11	iommu/arm-smmu: Fix arm_smmu_device refcount leak in address translation	Xiyu Yang
	The reference counting issue happens in several exception handling paths of arm_smmu_iova_to_phys_hard(). When those error scenarios occur, the function forgets to decrease the refcount of "smmu" increased by arm_smmu_rpm_get(), causing a refcount leak. Fix this issue by jumping to "out" label when those error scenarios occur. Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/1623293391-17261-1-git-send-email-xiyuyang19@fudan.edu.cn Signed-off-by: Will Deacon <will@kernel.org>
2021-06-11	iommu/arm-smmu: Fix arm_smmu_device refcount leak when arm_smmu_rpm_get fails	Xiyu Yang
	arm_smmu_rpm_get() invokes pm_runtime_get_sync(), which increases the refcount of the "smmu" even though the return value is less than 0. The reference counting issue happens in some error handling paths of arm_smmu_rpm_get() in its caller functions. When arm_smmu_rpm_get() fails, the caller functions forget to decrease the refcount of "smmu" increased by arm_smmu_rpm_get(), causing a refcount leak. Fix this issue by calling pm_runtime_resume_and_get() instead of pm_runtime_get_sync() in arm_smmu_rpm_get(), which can keep the refcount balanced in case of failure. Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Link: https://lore.kernel.org/r/1623293672-17954-1-git-send-email-xiyuyang19@fudan.edu.cn Signed-off-by: Will Deacon <will@kernel.org>
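	The difference between the two helpers can be modelled with a plain counter. This is a toy model under stated assumptions: `toy_dev`, `toy_get_sync()` and `toy_resume_and_get()` are hypothetical names, failure is forced by a flag, and the only behaviour modelled is the one described above, namely that the first helper leaves the usage count raised on failure while the second drops it again.

```c
#include <assert.h>
#include <errno.h>

/* Toy device: a usage counter plus a switch that forces resume to fail. */
struct toy_dev {
	int usage_count;
	int resume_fails;
};

/* Models the get_sync behaviour: the count is raised even on failure. */
int toy_get_sync(struct toy_dev *d)
{
	assert(d != 0);
	d->usage_count++;
	return d->resume_fails ? -EIO : 0;
}

/* Models the resume_and_get behaviour: the count is dropped on failure. */
int toy_resume_and_get(struct toy_dev *d)
{
	int ret = toy_get_sync(d);

	if (ret < 0) {
		d->usage_count--;
		return ret;
	}
	return 0;
}
```

	A caller that simply returns on error leaks one reference with the first helper and none with the second, which is why switching helpers fixes every caller at once.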
2021-06-11	iommu/arm-smmu: Use Tegra implementation on Tegra186	Thierry Reding
	Tegra186 requires the same SID override programming as Tegra194 in order to seamlessly transition from the firmware framebuffer to the Linux framebuffer, so the Tegra implementation needs to be used on Tegra186 devices as well. Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20210603164632.1000458-7-thierry.reding@gmail.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
2021-06-11	iommu/arm-smmu: tegra: Implement SID override programming	Thierry Reding
	The secure firmware keeps some SID override registers set as passthrough in order to allow devices such as the display controller to operate with no knowledge of SMMU translations until an operating system driver takes over. This is needed in order to seamlessly transition from the firmware framebuffer to the OS framebuffer. Upon successfully attaching a device to the SMMU and in the process creating identity mappings for memory regions that are being accessed, the Tegra implementation will call into the memory controller driver to program the override SIDs appropriately. Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20210603164632.1000458-6-thierry.reding@gmail.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
2021-06-11	iommu/arm-smmu: tegra: Detect number of instances at runtime	Thierry Reding
	Parse the reg property in device tree and detect the number of instances represented by a device tree node. This is subsequently needed in order to support single-instance SMMUs with the Tegra implementation because additional programming is needed to properly configure the SID override registers in the memory controller. Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20210603164632.1000458-5-thierry.reding@gmail.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
2021-06-09	iommu/arm-smmu-qcom: Protect acpi_match_platform_list() call with CONFIG_ACPI	Shawn Guo
	The struct acpi_platform_list and function acpi_match_platform_list() defined in include/linux/acpi.h are available only when CONFIG_ACPI is enabled. Add protection to fix the build issues with !CONFIG_ACPI. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Link: https://lore.kernel.org/r/20210609015511.3955-1-shawn.guo@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2021-06-08	iommu/arm: Cleanup resources in case of probe error path	Amey Narkhede
	If device registration fails, remove the sysfs attribute. If setting the bus callbacks fails, unregister the device and clean up the sysfs attribute. Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com> Link: https://lore.kernel.org/r/20210608164559.204023-1-ameynarkhede03@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
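The unwinding order described in this commit can be sketched in plain C with the usual goto-based error path. This is a minimal userspace illustration, not the driver code: struct fake_dev and every fake_* helper are hypothetical stand-ins for the sysfs, registration, and bus-callback steps.

```c
#include <stdbool.h>

/* Hypothetical device state; not a kernel structure. */
struct fake_dev {
	bool sysfs_added;
	bool registered;
	bool bus_ops_set;
};

/* Hypothetical setup/teardown helpers standing in for the real steps. */
static int fake_sysfs_add(struct fake_dev *d)
{
	d->sysfs_added = true;
	return 0;
}

static void fake_sysfs_remove(struct fake_dev *d)
{
	d->sysfs_added = false;
}

static int fake_register(struct fake_dev *d, bool fail)
{
	if (fail)
		return -1;
	d->registered = true;
	return 0;
}

static void fake_unregister(struct fake_dev *d)
{
	d->registered = false;
}

static int fake_set_bus_ops(struct fake_dev *d, bool fail)
{
	if (fail)
		return -1;
	d->bus_ops_set = true;
	return 0;
}

/* Each failure undoes only the steps that already succeeded, in reverse. */
int fake_probe(struct fake_dev *d, bool fail_register, bool fail_bus_ops)
{
	int ret;

	ret = fake_sysfs_add(d);
	if (ret)
		return ret;

	ret = fake_register(d, fail_register);
	if (ret)
		goto err_sysfs_remove;

	ret = fake_set_bus_ops(d, fail_bus_ops);
	if (ret)
		goto err_unregister;

	return 0;

err_unregister:
	fake_unregister(d);
err_sysfs_remove:
	fake_sysfs_remove(d);
	return ret;
}
```

Calling fake_probe() with fail_bus_ops set leaves the device neither registered nor present in sysfs, which is the property the fix is after.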
2021-06-08	iommu/arm-smmu-qcom: Move the adreno smmu specific impl	Sai Prakash Ranjan
	The Adreno (GPU) SMMU and the APSS (Application Processor SubSystem) SMMU both implement "arm,mmu-500" on some QTI SoCs. To run the Adreno SMMU specific implementation, which enables features such as split pagetable support, we need to match the "qcom,adreno-smmu" compatible before the APSS SMMU compatible; otherwise the APSS SMMU implementation runs for the Adreno SMMU and its additional features are never set. For example, "qcom,sc7280-smmu-500" is a compatible for both the APSS and Adreno SMMUs implementing "arm,mmu-500", so the current sequence matches the APSS SMMU compatible (qcom,sc7280-smmu-500) first, runs that implementation, and never reaches the Adreno SMMU specific one. Suggested-by: Akhil P Oommen <akhilpo@codeaurora.org> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Jordan Crouse <jordan@cosmicpenguin.net> Link: https://lore.kernel.org/r/c42181d313fdd440011541a28cde8cd10fffb9d3.1623155117.git.saiprakash.ranjan@codeaurora.org Signed-off-by: Will Deacon <will@kernel.org>
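The ordering problem in this commit is easy to see in a small userspace sketch. The compatible strings are taken from the commit message; the enum, has_compatible(), and pick_impl() are hypothetical names for illustration and do not reflect how the driver actually walks the device tree.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical implementation kinds, for illustration only. */
enum impl_kind { IMPL_GENERIC, IMPL_QCOM_APPS, IMPL_QCOM_ADRENO };

/* Return true if the node's compatible list contains the wanted string. */
static bool has_compatible(const char * const *compat, size_t n,
			   const char *want)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(compat[i], want) == 0)
			return true;
	return false;
}

/*
 * Check the more specific Adreno compatible first. If the SoC-level
 * compatible were tested first, a GPU SMMU node carrying both strings
 * would always be treated as the application processor SMMU.
 */
enum impl_kind pick_impl(const char * const *compat, size_t n)
{
	if (has_compatible(compat, n, "qcom,adreno-smmu"))
		return IMPL_QCOM_ADRENO;
	if (has_compatible(compat, n, "qcom,sc7280-smmu-500"))
		return IMPL_QCOM_APPS;
	return IMPL_GENERIC;
}
```

Swapping the two checks in pick_impl() reproduces the bug: a node listing both compatibles would then resolve to IMPL_QCOM_APPS.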
2021-06-08	iommu/arm-smmu-qcom: Add SC7280 SMMU compatible	Sai Prakash Ranjan
	Add compatible for SC7280 SMMU to use the Qualcomm Technologies, Inc. specific implementation. Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/53a50cd91c97b5b598a73941985b79b51acefa14.1623155117.git.saiprakash.ranjan@codeaurora.org Signed-off-by: Will Deacon <will@kernel.org>
2021-06-08	iommu: Drop unnecessary of_iommu.h includes	Rob Herring
	The only place of_iommu.h is needed is in drivers/of/device.c. Remove it from everywhere else. Cc: Will Deacon <will@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Rob Clark <robdclark@gmail.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Yong Wu <yong.wu@mediatek.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: iommu@lists.linux-foundation.org Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20210527193710.1281746-2-robh@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-08	iommu/arm-smmu-v3: Decrease the queue size of evtq and priq	Zhen Lei
	Commit d25f6ead162e ("iommu/arm-smmu-v3: Increase maximum size of queues") expands the cmdq queue size to improve the success rate of concurrent command queue space allocation by multiple cores. However, this extension does not apply to evtq and priq, because for both of them, the SMMU driver is the consumer. Instead, memory resources are wasted. Therefore, the queue size of evtq and priq is restored to the original setting, one page. Fixes: d25f6ead162e ("iommu/arm-smmu-v3: Increase maximum size of queues") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210531123553.9602-1-thunder.leizhen@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-08	iommu/arm-smmu-v3: Ratelimit event dump	Jean-Philippe Brucker
	When a device or driver misbehaves, it is possible to receive DMA fault events much faster than we can print them out, causing a lock up of the system and inability to cancel the source of the problem. Ratelimit printing of events to help recovery. Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20210531095648.118282-1-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
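The idea behind rate-limited printing can be sketched as a burst-per-interval counter. This is a simplified userspace model under stated assumptions: struct rl_state and rl_allow() are hypothetical, time is passed in explicitly as an abstract tick count, and the kernel's own ratelimit helpers are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state: at most 'burst' messages per 'interval'. */
struct rl_state {
	uint64_t interval;     /* window length in abstract ticks */
	unsigned int burst;    /* messages allowed per window */
	uint64_t window_start; /* tick at which the current window began */
	unsigned int printed;  /* messages let through in this window */
	unsigned int missed;   /* messages suppressed so far */
};

/* Return true if the caller may print an event at time 'now'. */
bool rl_allow(struct rl_state *rl, uint64_t now)
{
	/* Start a fresh window once the interval has elapsed. */
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->printed = 0;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	/* Over budget: drop the message but remember that we did. */
	rl->missed++;
	return false;
}
```

An event handler would call rl_allow() before dumping each event, so that a flood of faults costs a counter increment per event rather than a console write.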
2021-06-08	iommu/arm-smmu-v3: Add stall support for platform devices	Jean-Philippe Brucker
	The SMMU provides a Stall model for handling page faults in platform devices. It is similar to PCIe PRI, but doesn't require devices to have their own translation cache. Instead, faulting transactions are parked and the OS is given a chance to fix the page tables and retry the transaction. Enable stall for devices that support it (opt-in by firmware). When an event corresponds to a translation error, call the IOMMU fault handler. If the fault is recoverable, it will call us back to terminate or continue the stall. To use stall, device drivers need to enable IOMMU_DEV_FEAT_IOPF, which initializes the fault queue for the device. Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20210526161927.24268-4-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2021-06-08	iommu/arm-smmu-qcom: Skip the TTBR1 quirk for db820c.	Eric Anholt
	db820c wants to use the qcom smmu path to get HUPCF set (which keeps the GPU from wedging and then sometimes wedging the kernel after a page fault), but it doesn't have separate pagetables support yet in drm/msm so we can't go all the way to the TTBR1 path. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210326231303.3071950-1-eric@anholt.net Signed-off-by: Will Deacon <will@kernel.org>
2021-06-08	iommu/arm-smmu-qcom: Add sm6125 compatible	Martin Botka
	Add compatible for SM6125 SoC. Signed-off-by: Martin Botka <martin.botka@somainline.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210523212535.740979-1-martin.botka@somainline.org Signed-off-by: Will Deacon <will@kernel.org>
2021-06-08	iommu/arm-smmu: Implement ->probe_finalize()	Thierry Reding
	Implement a ->probe_finalize() callback that can be used by vendor implementations to perform extra programming necessary after devices have been attached to the SMMU. Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20210603164632.1000458-4-thierry.reding@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-08	iommu/arm-smmu-v3: Change array into const array	Bixuan Cui
	Fix checkpatch warning in arm-smmu-v3.c: "static const char * array should probably be static const char * const". Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Will Deacon <will@kernel.org>
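The difference the warning points at can be shown with a short sketch. The array contents and the names fake_event_names, FAKE_EVENT_COUNT, and fake_event_name() are hypothetical; only the declaration style comes from the commit.

```c
#include <stddef.h>

/*
 * Hypothetical string table. The first 'const' makes the characters
 * read-only; the second 'const' makes the pointers in the array
 * read-only too, so no element can be redirected at run time and the
 * whole table can live in read-only data.
 */
static const char * const fake_event_names[] = {
	"fault A",
	"fault B",
	"fault C",
};

#define FAKE_EVENT_COUNT \
	(sizeof(fake_event_names) / sizeof(fake_event_names[0]))

/* Bounds-checked lookup; out-of-range indices map to "unknown". */
const char *fake_event_name(size_t idx)
{
	return idx < FAKE_EVENT_COUNT ? fake_event_names[idx] : "unknown";
}
```

With only the first const, an assignment such as fake_event_names[0] = "other" would compile; with both, the compiler rejects it.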
2021-06-08	iommu/arm-smmu-qcom: hook up qcom_smmu_impl for ACPI boot	Shawn Guo
	The hookup with qcom_smmu_impl is required to do ACPI boot on SC8180X based devices like Lenovo Flex 5G laptop and Microsoft Surface Pro X. Define acpi_platform_list for these platforms and match them using acpi_match_platform_list() call, and create qcom_smmu_impl accordingly. (np == NULL) is used to check ACPI boot, because fwnode of SMMU device is a static allocation and thus helpers like has_acpi_companion() don't work here. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210509022607.17534-1-shawn.guo@linaro.org Signed-off-by: Will Deacon <will@kernel.org>