lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2022-05-31	net/mlx5: CT: Fix header-rewrite re-use for tupels	Paul Blakey
	Tuple entries that don't have nat configured for them which are added to the ct nat table will always create a new modify header, as we don't check for possible re-use on them. The same for tuples that have nat configured for them but are added to ct table. Fix the above by only avoiding wasteful re-use lookup for actually natted entries in ct nat table. Fixes: 7fac5c2eced3 ("net/mlx5: CT: Avoid reusing modify header context for natted entries") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-31	net/mlx5e: TC NIC mode, fix tc chains miss table	Maor Dickman
	The cited commit changed promisc table to be created on demand with the highest priority in the NIC table replacing the vlan table, this caused tc NIC tables miss flow to skip the prmoisc table because it use vlan table as miss table. OVS offload in NIC mode use promisc by default so any unicast packet which will be handled by tc NIC tables miss flow will skip the promisc rule and will be dropped. Fix this by adding new empty table in new tc level with low priority and point the nic tc chain miss to it, the new table is managed so it will point to vlan table if promisc is disabled and to promisc table if enabled. Fixes: 1c46d7409f30 ("net/mlx5e: Optimize promiscuous mode") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-31	net/mlx5: Don't use already freed action pointer	Leon Romanovsky
	The call to mlx5dr_action_destroy() releases "action" memory. That pointer is set to miss_action later and generates the following smatch error: drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c:53 set_miss_action() warn: 'action' was already freed. Make sure that the pointer is always valid by setting NULL after destroy. Fixes: 6a48faeeca10 ("net/mlx5: Add direct rule fs_cmd implementation") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-31	dm verity: set DM_TARGET_IMMUTABLE feature flag	Sarthak Kukreti
	The device-mapper framework provides a mechanism to mark targets as immutable (and hence fail table reloads that try to change the target type). Add the DM_TARGET_IMMUTABLE flag to the dm-verity target's feature flags to prevent switching the verity target with a different target type. Fixes: a4ffc152198e ("dm: add verity target") Cc: stable@vger.kernel.org Signed-off-by: Sarthak Kukreti <sarthakkukreti@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-05-31	PCI/PM: Fix bridge_d3_blacklist[] Elo i2 overwrite of Gigabyte X299	Bjorn Helgaas
	92597f97a40b ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold") omitted braces around the new Elo i2 entry, so it overwrote the existing Gigabyte X299 entry. Add the appropriate braces. Found by: $ make W=1 drivers/pci/pci.o CC drivers/pci/pci.o drivers/pci/pci.c:2974:12: error: initialized field overwritten [-Werror=override-init] 2974 \| .ident = "Elo i2", \| ^~~~~~~~ Link: https://lore.kernel.org/r/20220526221258.GA409855@bhelgaas Fixes: 92597f97a40b ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.15+
2022-05-31	Revert "PCI: brcmstb: Split brcm_pcie_setup() into two funcs"	Bjorn Helgaas
	This reverts commit 830aa6f29f07a4e2f1a947dfa72b3ccddb46dd21. This is part of a revert of the following commits: 11ed8b8624b8 ("PCI: brcmstb: Do not turn off WOL regulators on suspend") 93e41f3fca3d ("PCI: brcmstb: Add control of subdevice voltage regulators") 67211aadcb4b ("PCI: brcmstb: Add mechanism to turn on subdev regulators") 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs") Cyril reported that 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs"), which appeared in v5.17-rc1, broke booting on the Raspberry Pi Compute Module 4. Apparently 830aa6f29f07 panics with an Asynchronous SError Interrupt, and after further commits here is a black screen on HDMI and no output on the serial console. This does not seem to affect the Raspberry Pi 4 B. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215925 Link: https://lore.kernel.org/r/20220511201856.808690-5-helgaas@kernel.org Reported-by: Cyril Brulebois <kibi@debian.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-05-31	Revert "PCI: brcmstb: Add mechanism to turn on subdev regulators"	Bjorn Helgaas
	This reverts commit 67211aadcb4b968d0fdc57bc27240fa71500c2d4. This is part of a revert of the following commits: 11ed8b8624b8 ("PCI: brcmstb: Do not turn off WOL regulators on suspend") 93e41f3fca3d ("PCI: brcmstb: Add control of subdevice voltage regulators") 67211aadcb4b ("PCI: brcmstb: Add mechanism to turn on subdev regulators") 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs") Cyril reported that 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs"), which appeared in v5.17-rc1, broke booting on the Raspberry Pi Compute Module 4. Apparently 830aa6f29f07 panics with an Asynchronous SError Interrupt, and after further commits here is a black screen on HDMI and no output on the serial console. This does not seem to affect the Raspberry Pi 4 B. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215925 Link: https://lore.kernel.org/r/20220511201856.808690-4-helgaas@kernel.org Reported-by: Cyril Brulebois <kibi@debian.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-05-31	Revert "PCI: brcmstb: Add control of subdevice voltage regulators"	Bjorn Helgaas
	This reverts commit 93e41f3fca3d4a0f927b784012338c37f80a8a80. This is part of a revert of the following commits: 11ed8b8624b8 ("PCI: brcmstb: Do not turn off WOL regulators on suspend") 93e41f3fca3d ("PCI: brcmstb: Add control of subdevice voltage regulators") 67211aadcb4b ("PCI: brcmstb: Add mechanism to turn on subdev regulators") 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs") Cyril reported that 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs"), which appeared in v5.17-rc1, broke booting on the Raspberry Pi Compute Module 4. Apparently 830aa6f29f07 panics with an Asynchronous SError Interrupt, and after further commits here is a black screen on HDMI and no output on the serial console. This does not seem to affect the Raspberry Pi 4 B. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215925 Link: https://lore.kernel.org/r/20220511201856.808690-3-helgaas@kernel.org Reported-by: Cyril Brulebois <kibi@debian.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-05-31	Revert "PCI: brcmstb: Do not turn off WOL regulators on suspend"	Bjorn Helgaas
	This reverts commit 11ed8b8624b8085f706864b4addcd304b1e4fc38. This is part of a revert of the following commits: 11ed8b8624b8 ("PCI: brcmstb: Do not turn off WOL regulators on suspend") 93e41f3fca3d ("PCI: brcmstb: Add control of subdevice voltage regulators") 67211aadcb4b ("PCI: brcmstb: Add mechanism to turn on subdev regulators") 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs") Cyril reported that 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs"), which appeared in v5.17-rc1, broke booting on the Raspberry Pi Compute Module 4. Apparently 830aa6f29f07 panics with an Asynchronous SError Interrupt, and after further commits here is a black screen on HDMI and no output on the serial console. This does not seem to affect the Raspberry Pi 4 B. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215925 Link: https://lore.kernel.org/r/20220511201856.808690-2-helgaas@kernel.org Reported-by: Cyril Brulebois <kibi@debian.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-05-31	dm table: fix dm_table_supports_poll to return false if no data devices	Mike Snitzer
	It was reported that the "generic/250" test in xfstests (which uses the dm-error target) demonstrates a regression where the kernel crashes in bioset_exit(). Since commit cfc97abcbe0b ("dm: conditionally enable BIOSET_PERCPU_CACHE for dm_io bioset") the bioset_init() for the dm_io bioset will setup the bioset's per-cpu alloc cache if all devices have QUEUE_FLAG_POLL set. But there was an bug where a target that doesn't have any data devices (and that doesn't even set the .iterate_devices dm target callback) will incorrectly return true from dm_table_supports_poll(). Fix this by updating dm_table_supports_poll() to follow dm-table.c's well-worn pattern for testing that _all_ targets in a DM table do in fact have underlying devices that set QUEUE_FLAG_POLL. NOTE: An additional block fix is still needed so that bio_alloc_cache_destroy() clears the bioset's ->cache member. Otherwise, a DM device's table reload that transitions the DM device's bioset from using a per-cpu alloc cache to _not_ using one will result in bioset_exit() crashing in bio_alloc_cache_destroy() because dm's dm_io bioset ("io_bs") was left with a stale ->cache member. Fixes: cfc97abcbe0b ("dm: conditionally enable BIOSET_PERCPU_CACHE for dm_io bioset") Reported-by: Matthew Wilcox <willy@infradead.org> Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-05-31	Merge tag 'iommu-updates-v5.19' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - Intel VT-d driver updates: - Domain force snooping improvement. - Cleanups, no intentional functional changes. - ARM SMMU driver updates: - Add new Qualcomm device-tree compatible strings - Add new Nvidia device-tree compatible string for Tegra234 - Fix UAF in SMMUv3 shared virtual addressing code - Force identity-mapped domains for users of ye olde SMMU legacy binding - Minor cleanups - Fix a BUG_ON in the vfio_iommu_group_notifier: - Groundwork for upcoming iommufd framework - Introduction of DMA ownership so that an entire IOMMU group is either controlled by the kernel or by user-space - MT8195 and MT8186 support in the Mediatek IOMMU driver - Make forcing of cache-coherent DMA more coherent between IOMMU drivers - Fixes for thunderbolt device DMA protection - Various smaller fixes and cleanups * tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits) iommu/amd: Increase timeout waiting for GA log enablement iommu/s390: Tolerate repeat attach_dev calls iommu/vt-d: Remove hard coding PGSNP bit in PASID entries iommu/vt-d: Remove domain_update_iommu_snooping() iommu/vt-d: Check domain force_snooping against attached devices iommu/vt-d: Block force-snoop domain attaching if no SC support iommu/vt-d: Size Page Request Queue to avoid overflow condition iommu/vt-d: Fold dmar_insert_one_dev_info() into its caller iommu/vt-d: Change return type of dmar_insert_one_dev_info() iommu/vt-d: Remove unneeded validity check on dev iommu/dma: Explicitly sort PCI DMA windows iommu/dma: Fix iova map result check bug iommu/mediatek: Fix NULL pointer dereference when printing dev_name iommu: iommu_group_claim_dma_owner() must always assign a domain iommu/arm-smmu: Force identity domains for legacy binding iommu/arm-smmu: Support Tegra234 SMMU dt-bindings: arm-smmu: Add compatible for Tegra234 SOC dt-bindings: arm-smmu: Document nvidia,memory-controller property iommu/arm-smmu-qcom: Add SC8280XP support dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP ...
2022-05-31	vhost: rename vhost_work_dev_flush	Mike Christie
	This patch renames vhost_work_dev_flush to just vhost_dev_flush to relfect that it flushes everything on the device and that drivers don't know/care that polls are based on vhost_works. Drivers just flush the entire device and polls, and works for vhost-scsi management TMFs and IO net virtqueues, etc all are flushed. Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220517180850.198915-9-michael.christie@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost-test: drop flush after vhost_dev_cleanup	Mike Christie
	The flush after vhost_dev_cleanup is not needed because: 1. It doesn't do anything. vhost_dev_cleanup will stop the worker thread so the flush call will just return since the worker has not device. 2. It's not needed. The comment about jobs re-queueing themselves does not look correct because handle_vq does not requeue work. Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220517180850.198915-8-michael.christie@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost-scsi: drop flush after vhost_dev_cleanup	Mike Christie
	The flush after vhost_dev_cleanup is not needed because: 1. It doesn't do anything. vhost_dev_cleanup will stop the worker thread so the flush call will just return since the worker has not device. 2. It's not needed for the re-queue case. vhost_scsi_evt_handle_kick grabs the mutex and if the backend is NULL will return without queueing a work. vhost_scsi_clear_endpoint will set the backend to NULL under the vq->mutex then drops the mutex and does a flush. So we know when vhost_scsi_clear_endpoint has dropped the mutex after clearing the backend no evt related work will be able to requeue. The flush would then make sure any queued evts are run and return. Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220517180850.198915-7-michael.christie@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost_vsock: simplify vhost_vsock_flush()	Andrey Ryabinin
	vhost_vsock_flush() calls vhost_work_dev_flush(vsock->vqs[i].poll.dev) before vhost_work_dev_flush(&vsock->dev). This seems pointless as vsock->vqs[i].poll.dev is the same as &vsock->dev and several flushes in a row doesn't do anything useful, one is just enough. Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220517180850.198915-6-michael.christie@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost_test: remove vhost_test_flush_vq()	Andrey Ryabinin
	vhost_test_flush_vq() just a simple wrapper around vhost_work_dev_flush() which seems have no value. It's just easier to call vhost_work_dev_flush() directly. Besides there is no point in obtaining vhost_dev pointer via 'n->vqs[index].poll.dev' while we can just use &n->dev. It's the same pointers, see vhost_test_open()/vhost_dev_init(). Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220517180850.198915-5-michael.christie@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-05-31	vhost_net: get rid of vhost_net_flush_vq() and extra flush calls	Andrey Ryabinin
	vhost_net_flush_vq() calls vhost_work_dev_flush() twice passing vhost_dev pointer obtained via 'n->poll[index].dev' and 'n->vqs[index].vq.poll.dev'. This is actually the same pointer, initialized in vhost_net_open()/vhost_dev_init()/vhost_poll_init() Remove vhost_net_flush_vq() and call vhost_work_dev_flush() directly. Do the flushes only once instead of several flush calls in a row which seems rather useless. Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com> [drop vhost_dev forward declaration in vhost.h] Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220517180850.198915-4-michael.christie@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost: flush dev once during vhost_dev_stop	Mike Christie
	When vhost_work_dev_flush returns all work queued at that time will have completed. There is then no need to flush after every vhost_poll_stop call, and we can move the flush call to after the loop that stops the pollers. Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220517180850.198915-3-michael.christie@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost: get rid of vhost_poll_flush() wrapper	Andrey Ryabinin
	vhost_poll_flush() is a simple wrapper around vhost_work_dev_flush(). It gives wrong impression that we are doing some work over vhost_poll, while in fact it flushes vhost_poll->dev. It only complicate understanding of the code and leads to mistakes like flushing the same vhost_dev several times in a row. Just remove vhost_poll_flush() and call vhost_work_dev_flush() directly. Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com> [merge vhost_poll_flush removal from Stefano Garzarella] Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220517180850.198915-2-michael.christie@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost-vdpa: return -EFAULT on copy_to_user() failure	Dan Carpenter
	The copy_to_user() function returns the number of bytes remaining to be copied. However, we need to return a negative error code, -EFAULT, to the user. Fixes: 87f4c217413a ("vhost-vdpa: introduce uAPI to get the number of virtqueue groups") Fixes: e96ef636f154 ("vhost-vdpa: introduce uAPI to get the number of address spaces") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <YotG1vXKXXSayr63@kili> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-05-31	vdpasim: Off by one in vdpasim_set_group_asid()	Dan Carpenter
	The > comparison needs to be >= to prevent an out of bounds access of the vdpasim->iommu[] array. The vdpasim->iommu[] is allocated in vdpasim_create() and it has vdpasim->dev_attr.nas elements. Fixes: 87e5afeac247 ("vdpasim: control virtqueue support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <YotGQU1q224RKZR8@kili> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31	virtio: Directly use ida_alloc()/free()	keliu
	Use ida_alloc()/ida_free() instead of deprecated ida_simple_get()/ida_simple_remove() . Signed-off-by: keliu <liuke94@huawei.com> Message-Id: <20220527073302.2474073-1-liuke94@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	virtio: harden vring IRQ	Jason Wang
	This is a rework on the previous IRQ hardening that is done for virtio-pci where several drawbacks were found and were reverted: 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ that is used by some device such as virtio-blk 2) done only for PCI transport The vq->broken is re-used in this patch for implementing the IRQ hardening. The vq->broken is set to true during both initialization and reset. And the vq->broken is set to false in virtio_device_ready(). Then vring_interrupt() can check and return when vq->broken is true. And in this case, switch to return IRQ_NONE to let the interrupt core aware of such invalid interrupt to prevent IRQ storm. The reason of using a per queue variable instead of a per device one is that we may need it for per queue reset hardening in the future. Note that the hardening is only done for vring interrupt since the config interrupt hardening is already done in commit 22b7050a024d7 ("virtio: defer config changed notifications"). But the method that is used by config interrupt can't be reused by the vring interrupt handler because it uses spinlock to do the synchronization which is expensive. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Vineeth Vijayan <vneethv@linux.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220527060120.20964-9-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31	virtio: allow to unbreak virtqueue	Jason Wang
	This patch allows the new introduced __virtio_break_device() to unbreak the virtqueue. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Vineeth Vijayan <vneethv@linux.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220527060120.20964-8-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31	virtio-ccw: implement synchronize_cbs()	Jason Wang
	This patch tries to implement the synchronize_cbs() for ccw. For the vring_interrupt() that is called via virtio_airq_handler(), the synchronization is simply done via the airq_info's lock. For the vring_interrupt() that is called via virtio_ccw_int_handler(), a per device rwlock is introduced and used in the synchronization method. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Vineeth Vijayan <vneethv@linux.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: linux-s390@vger.kernel.org Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220527060120.20964-7-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	virtio-mmio: implement synchronize_cbs()	Jason Wang
	Simply synchronize the platform irq that is used by us. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Vineeth Vijayan <vneethv@linux.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: linux-s390@vger.kernel.org Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220527060120.20964-6-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31	virtio-pci: implement synchronize_cbs()	Jason Wang
	We can simply reuse vp_synchronize_vectors() for .synchronize_cbs(). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Vineeth Vijayan <vneethv@linux.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: linux-s390@vger.kernel.org Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220527060120.20964-5-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31	virtio: use virtio_reset_device() when possible	Jason Wang
	This allows us to do common extension without duplicating code. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Vineeth Vijayan <vneethv@linux.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: linux-s390@vger.kernel.org Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220527060120.20964-3-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2022-05-31	virtio: use virtio_device_ready() in virtio_device_restore()	Stefano Garzarella
	It will allow us to do extension on virtio_device_ready() without duplicating code. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Vineeth Vijayan <vneethv@linux.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: linux-s390@vger.kernel.org Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220527060120.20964-2-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31	vdpasim: allow to enable a vq repeatedly	Eugenio Pérez
	Code must be resilient to enable a queue many times. At the moment the queue is resetting so it's definitely not the expected behavior. v2: set vq->ready = 0 at disable. Fixes: 2c53d0f64c06 ("vdpasim: vDPA device simulator") Cc: stable@vger.kernel.org Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220519145919.772896-1-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-05-31	vDPA/ifcvf: fix uninitialized config_vector warning	Zhu Lingshan
	Static checkers are not informed that config_vector is controlled by vf->msix_vector_status, which can only be MSIX_VECTOR_SHARED_VQ_AND_CONFIG, MSIX_VECTOR_SHARED_VQ_AND_CONFIG and MSIX_VECTOR_DEV_SHARED. This commit uses an "if...elseif...else" code block to tell the checkers that it is a complete set, and config_vector can be initialized anyway Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <20220424072806.1083189-1-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31	vdpa/vp_vdpa : add vdpa tool support in vp_vdpa	Cindy Lu
	this patch is to add the support for vdpa tool in vp_vdpa here is the example steps modprobe vp_vdpa modprobe vhost_vdpa echo 0000:00:06.0>/sys/bus/pci/drivers/virtio-pci/unbind echo 1af4 1041 > /sys/bus/pci/drivers/vp-vdpa/new_id vdpa dev add name vdpa1 mgmtdev pci/0000:00:06.0 Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20220429091030.547434-1-lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31	virtio: Replace long long int with long long	Solomon Tan
	This patch addresses the checkpatch.pl warning that long long is preferred over long long int. Signed-off-by: Solomon Tan <solomonbstoner@protonmail.ch> Message-Id: <YlzTUQa06sP94zxB@ArchDesktop> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	virtio: Replace unsigned with unsigned int	Solomon Tan
	This patch addresses the checkpatch.pl warning where unsigned int is preferred over unsigned. Signed-off-by: Solomon Tan <solomonbstoner@protonmail.ch> Message-Id: <YlzS49Wo8JMDhKOt@ArchDesktop> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	virtio-crypto: enable retry for virtio-crypto-dev	lei he
	Enable retry for virtio-crypto-dev, so that crypto-engine can process cipher-requests parallelly. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: lei he <helei.sig11@bytedance.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-6-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	virtio-crypto: adjust dst_len at ops callback	lei he
	For some akcipher operations(eg, decryption of pkcs1pad(rsa)), the length of returned result maybe less than akcipher_req->dst_len, we need to recalculate the actual dst_len through the virt-queue protocol. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: lei he <helei.sig11@bytedance.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-5-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	virtio-crypto: wait ctrl queue instead of busy polling	zhenwei pi
	Originally, after submitting request into virtio crypto control queue, the guest side polls the result from the virt queue. This works like following: CPU0 CPU1 ... CPUx CPUy \| \| \| \| \ \ / / \--------spin_lock(&vcrypto->ctrl_lock)-------/ \| virtqueue add & kick \| busy poll virtqueue \| spin_unlock(&vcrypto->ctrl_lock) ... There are two problems: 1, The queue depth is always 1, the performance of a virtio crypto device gets limited. Multi user processes share a single control queue, and hit spin lock race from control queue. Test on Intel Platinum 8260, a single worker gets ~35K/s create/close session operations, and 8 workers get ~40K/s operations with 800% CPU utilization. 2, The control request is supposed to get handled immediately, but in the current implementation of QEMU(v6.2), the vCPU thread kicks another thread to do this work, the latency also gets unstable. Tracking latency of virtio_crypto_alg_akcipher_close_session in 5s: usecs : count distribution 0 -> 1 : 0 \| \| 2 -> 3 : 7 \| \| 4 -> 7 : 72 \| \| 8 -> 15 : 186485 \|************************\| 16 -> 31 : 687 \| \| 32 -> 63 : 5 \| \| 64 -> 127 : 3 \| \| 128 -> 255 : 1 \| \| 256 -> 511 : 0 \| \| 512 -> 1023 : 0 \| \| 1024 -> 2047 : 0 \| \| 2048 -> 4095 : 0 \| \| 4096 -> 8191 : 0 \| \| 8192 -> 16383 : 2 \| \| This means that a CPU may hold vcrypto->ctrl_lock as long as 8192~16383us. To improve the performance of control queue, a request on control queue waits completion instead of busy polling to reduce lock racing, and gets completed by control queue callback. CPU0 CPU1 ... CPUx CPUy \| \| \| \| \ \ / / \--------spin_lock(&vcrypto->ctrl_lock)-------/ \| virtqueue add & kick \| ---------spin_unlock(&vcrypto->ctrl_lock)------ / / \ \ \| \| \| \| wait wait wait wait Test this patch, the guest side get ~200K/s operations with 300% CPU utilization. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-4-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	virtio-crypto: use private buffer for control request	zhenwei pi
	Originally, all of the control requests share a single buffer( ctrl & input & ctrl_status fields in struct virtio_crypto), this allows queue depth 1 only, the performance of control queue gets limited by this design. In this patch, each request allocates request buffer dynamically, and free buffer after request, so the scope protected by ctrl_lock also get optimized here. It's possible to optimize control queue depth in the next step. A necessary comment is already in code, still describe it again: /* * Note: there are padding fields in request, clear them to zero before * sending to host to avoid to divulge any information. * Ex, virtio_crypto_ctrl_request::ctrl::u::destroy_session::padding[48] */ So use kzalloc to allocate buffer of struct virtio_crypto_ctrl_request. Potentially dereferencing uninitialized variables: Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-3-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	virtio-crypto: change code style	zhenwei pi
	Use temporary variable to make code easy to read and maintain. /* Pad cipher's parameters */ vcrypto->ctrl.u.sym_create_session.op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER); vcrypto->ctrl.u.sym_create_session.u.cipher.para.algo = vcrypto->ctrl.header.algo; vcrypto->ctrl.u.sym_create_session.u.cipher.para.keylen = cpu_to_le32(keylen); vcrypto->ctrl.u.sym_create_session.u.cipher.para.op = cpu_to_le32(op); --> sym_create_session = &ctrl->u.sym_create_session; sym_create_session->op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER); sym_create_session->u.cipher.para.algo = ctrl->header.algo; sym_create_session->u.cipher.para.keylen = cpu_to_le32(keylen); sym_create_session->u.cipher.para.op = cpu_to_le32(op); The new style shows more obviously: - the variable we want to operate. - an assignment statement in a single line. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-2-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	virtio-pci: Remove wrong address verification in vp_del_vqs()	Murilo Opsfelder Araujo
	GCC 12 enhanced -Waddress when comparing array address to null [0], which warns: drivers/virtio/virtio_pci_common.c: In function ‘vp_del_vqs’: drivers/virtio/virtio_pci_common.c:257:29: warning: the comparison will always evaluate as ‘true’ for the pointer operand in ‘vp_dev->msix_affinity_masks + (sizetype)((long unsigned int)i * 256)’ must not be NULL [-Waddress] 257 \| if (vp_dev->msix_affinity_masks[i]) \| ^~~~~~ In fact, the verification is comparing the result of a pointer arithmetic, the address "msix_affinity_masks + i", which will always evaluate to true. Under the hood, free_cpumask_var() calls kfree(), which is safe to pass NULL, not requiring non-null verification. So remove the verification to make compiler happy (happy compiler, happy life). [0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102103 Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Message-Id: <20220415023002.49805-1-muriloo@linux.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Christophe de Dinechin <dinechin@redhat.com>
2022-05-31	virtio: pci: Fix an error handling path in vp_modern_probe()	Christophe JAILLET
	If an error occurs after a successful pci_request_selected_regions() call, it should be undone by a corresponding pci_release_selected_regions() call, as already done in vp_modern_remove(). Fixes: fd502729fbbf ("virtio-pci: introduce modern device module") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Message-Id: <237109725aad2c3c03d14549f777b1927c84b045.1648977064.git.christophe.jaillet@wanadoo.fr> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vdpasim: control virtqueue support	Gautam Dawar
	This patch introduces the control virtqueue support for vDPA simulator. This is a requirement for supporting advanced features like multiqueue. A requirement for control virtqueue is to isolate its memory access from the rx/tx virtqueues. This is because when using vDPA device for VM, the control virqueue is not directly assigned to VM. Userspace (Qemu) will present a shadow control virtqueue to control for recording the device states. The isolation is done via the virtqueue groups and ASID support in vDPA through vhost-vdpa. The simulator is extended to have: 1) three virtqueues: RXVQ, TXVQ and CVQ (control virtqueue) 2) two virtqueue groups: group 0 contains RXVQ and TXVQ; group 1 contains CVQ 3) two address spaces and the simulator simply implements the address spaces by mapping it 1:1 to IOTLB. For the VM use cases, userspace(Qemu) may set AS 0 to group 0 and AS 1 to group 1. So we have: 1) The IOTLB for virtqueue group 0 contains the mappings of guest, so RX and TX can be assigned to guest directly. 2) The IOTLB for virtqueue group 1 contains the mappings of CVQ which is the buffers that allocated and managed by VMM only. So CVQ of vhost-vdpa is visible to VMM only. And Guest can not access the CVQ of vhost-vdpa. For the other use cases, since AS 0 is associated to all virtqueue groups by default. All virtqueues share the same mapping by default. To demonstrate the function, VIRITO_NET_F_CTRL_MACADDR is implemented in the simulator for the driver to set mac address. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-20-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vdpa_sim: filter destination mac address	Gautam Dawar
	This patch implements a simple unicast filter for vDPA simulator. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-19-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vdpa_sim: factor out buffer completion logic	Gautam Dawar
	Wrap up common buffer completion logic in to vdpasim_net_complete Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-18-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vdpa_sim: advertise VIRTIO_NET_F_MTU	Gautam Dawar
	We've already reported maximum mtu via config space, so let's advertise the feature. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-17-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost-vdpa: support ASID based IOTLB API	Gautam Dawar
	This patch extends the vhost-vdpa to support ASID based IOTLB API. The vhost-vdpa device will allocated multiple IOTLBs for vDPA device that supports multiple address spaces. The IOTLBs and vDPA device memory mappings is determined and maintained through ASID. Note that we still don't support vDPA device with more than one address spaces that depends on platform IOMMU. This work will be done by moving the IOMMU logic from vhost-vDPA to vDPA device driver. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-16-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Includes fixup: vhost-vdpa: Fix some error handling path in vhost_vdpa_process_iotlb_msg() In the error paths introduced by the original patch, a mutex may be left locked. Add the correct goto instead of a direct return. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Message-Id: <89ef0ae4c26ac3cfa440c71e97e392dcb328ac1b.1653227924.git.christophe.jaillet@wanadoo.fr> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost-vdpa: introduce uAPI to set group ASID	Gautam Dawar
	Follows the vDPA support for associating ASID to a specific virtqueue group. This patch adds a uAPI to support setting them from userspace. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-15-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost-vdpa: uAPI to get virtqueue group id	Gautam Dawar
	Follows the support for virtqueue group in vDPA. This patches introduces uAPI to get the virtqueue group ID for a specific virtqueue in vhost-vdpa. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-14-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost-vdpa: introduce uAPI to get the number of address spaces	Gautam Dawar
	This patch introduces the uAPI for getting the number of address spaces supported by this vDPA device. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-13-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31	vhost-vdpa: introduce uAPI to get the number of virtqueue groups	Gautam Dawar
	Follows the vDPA support for multiple address spaces, this patch introduce uAPI for the userspace to know the number of virtqueue groups supported by the vDPA device. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-12-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>