path: root/drivers/gpu/drm/amd/amdgpu
Age | Commit message | Author
2026-04-24 | Merge tag 'drm-next-2026-04-24' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds
    Pull drm next fixes from Dave Airlie:
    "This is the first of two fixes for the merge PRs, the other is based on 7.0 branch.
    This mostly AMD fixes, a couple of weeks of backlog built up and this weeks. The main complaint I've seen is some boot warnings around the FP code handling which this should fix. Otherwise a single rcar-du and a single i915 fix.
    amdgpu:
    - SMU 14 fixes
    - Partition fixes
    - SMUIO 15.x fix
    - SR-IOV fixes
    - JPEG fix
    - PSP 15.x fix
    - NBIF fix
    - Devcoredump fixes
    - DPC fix
    - RAS fixes
    - Aldebaran smu fix
    - IP discovery fix
    - SDMA 7.1 fix
    - Runtime pm fix
    - MES 12.1 fix
    - DML2 fixes
    - DCN 4.2 fixes
    - YCbCr fixes
    - Freesync fixes
    - ISM fixes
    - Overlay cursor fix
    - DC FP fixes
    - UserQ locking fixes
    - DC idle state manager fix
    - ASPM fix
    - GPUVM SVM fix
    - DCE 6 fix
    amdkfd:
    - Fix memory clear handling
    - num_of_nodes bounds check fix
    i915:
    - Fix uninitialized variable in the alignment loop [psr]
    rcar-du:
    - fix NULL-ptr crash"
    * tag 'drm-next-2026-04-24' of https://gitlab.freedesktop.org/drm/kernel: (75 commits)
      drm/amdkfd: Add upper bound check for num_of_nodes
      drm: rcar-du: Fix crash when no CMM is available
      drm/amd/display: Disable 10-bit truncation and dithering on DCE 6.x
      drm/amdgpu: OR init_pte_flags into invalid leaf PTE updates
      drm/amd: Adjust ASPM support quirk to cover more Intel hosts
      drm/amd/display: Undo accidental fix revert in amdgpu_dm_ism.c
      drm/i915/psr: Init variable to avoid early exit from et alignment loop
      drm/amdgpu: drop userq fence driver refs out of fence process()
      drm/amdgpu/userq: unpin and unref doorbell and wptr outside mutex
      drm/amdgpu/userq: use pm_runtime_resume_and_get and fix err handling
      drm/amdgpu/userq: unmap_helper dont return the queue state
      drm/amdgpu/userq: unmap is to be called before freeing doorbell/wptr bo
      drm/amdgpu/userq: hold root bo lock in caller of input_va_validate
      drm/amdgpu/userq: caller to take reserv lock for vas_list_cleanup
      drm/amdgpu/userq: create_mqd does not need userq_mutex
      drm/amdgpu/userq: dont lock root bo with userq_mutex held
      drm/amdgpu/userq: fix kerneldoc for amdgpu_userq_ensure_ev_fence
      drm/amdgpu/userq: clean the VA mapping list for failed queue creation
      drm/amdgpu/userq: avoid uneccessary locking in amdgpu_userq_create
      drm/amd/display: Fix ISM teardown crash from NULL dc dereference
      ...
2026-04-21 | Merge tag 'drm-next-2026-04-22' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds
    Pull more drm updates from Dave Airlie:
    "This is a followup which is mostly next material with some fixes.
    Alex pointed out I missed one of his AMD MRs from last week, so I added that, then Jani sent the pipe reordering stuff, otherwise it's just some minor i915 fixes and a dma-buf fix.
    drm:
    - Add support for AMD VSDB parsing to drm_edid
    dma-buf:
    - fix documentation formatting
    i915:
    - add support for reordered pipes to support joined pipes better
    - Fix VESA backlight possible check condition
    - Verify the correct plane DDB entry
    amdgpu:
    - Audio regression fix
    - Use drm edid parser for AMD VSDB
    - Misc cleanups
    - VCE cs parse fixes
    - VCN cs parse fixes
    - RAS fixes
    - Clean up and unify vram reservation handling
    - GPU Partition updates
    - system_wq cleanups
    - Add CONFIG_GCOV_PROFILE_AMDGPU kconfig option
    - SMU vram copy updates
    - SMU 13/14/15 fixes
    - UserQ fixes
    - Replace pasid idr with an xarray
    - Dither handling fix
    - Enable amdgpu by default for CIK APUs
    - Add IBs to devcoredump
    amdkfd:
    - system_wq cleanups
    radeon:
    - system_wq cleanups"
    * tag 'drm-next-2026-04-22' of https://gitlab.freedesktop.org/drm/kernel: (62 commits)
      drm/i915/display: change pipe allocation order for discrete platforms
      drm/i915/wm: Verify the correct plane DDB entry
      drm/i915/backlight: Fix VESA backlight possible check condition
      drm/i915: Walk crtcs in pipe order
      drm/i915/joiner: Make joiner "nomodeset" state copy independent of pipe order
      dma-buf: fix htmldocs error for dma_buf_attach_revocable
      drm/amdgpu: dump job ibs in the devcoredump
      drm/amdgpu: store ib info for devcoredump
      drm/amdgpu: extract amdgpu_vm_lock_by_pasid from amdgpu_vm_handle_fault
      drm/amdgpu: Use amdgpu by default for CIK APUs too
      drm/amd/display: Remove unused NUM_ELEMENTS macros
      drm/amd/display: Replace inline NUM_ELEMENTS macro with ARRAY_SIZE
      drm/amdgpu: save ring content before resetting the device
      drm/amdgpu: make userq fence_drv drop explicit in queue destroy
      drm/amdgpu: rework userq fence driver alloc/destroy
      drm/amdgpu/userq: use dma_fence_wait_timeout without test for signalled
      drm/amdgpu/userq: call dma_resv_wait_timeout without test for signalled
      drm/amdgpu/userq: add the return code too in error condition
      drm/amdgpu/userq: fence wait for max time in amdgpu_userq_wait_for_signal
      drm/amd/display: Change dither policy for 10 bpc output back to dithering
      ...
2026-04-21 | drm/amdgpu: OR init_pte_flags into invalid leaf PTE updates | Siwei He
    Invalid leaf clears that only set AMDGPU_PTE_EXECUTABLE match the old GMC9 fault-priority workaround but omit adev->gmc.init_pte_flags. On GFX12 that includes AMDGPU_PTE_IS_PTE; without it, some cleared PTEs can fault as no-retry and bypass the SVM/XNACK handler when a VA is reused after a BO unmap.
    Apply init_pte_flags in amdgpu_vm_pte_update_flags() alongside EXECUTABLE so range-driven clears (e.g. amdgpu_vm_clear_freed) match amdgpu_vm_pt_clear() for leaf templates.
    Signed-off-by: Siwei He <siwei.he@amd.com>
    Reviewed-by: Philip Yang <philip.yang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 9d47b2c36b9a6c6b844c33cab407a5d7ad102234)
2026-04-21 | drm/amd: Adjust ASPM support quirk to cover more Intel hosts | Mario Limonciello
    Some of the same issues identified in commit c770ef19673fb ("drm/amd/amdgpu: disable ASPM in some situations") also affect Tiger Lake systems with GFX11 connected over USB4. Widen the net to also match these hosts.
    Fixes: d9b3a066dfcd ("drm/amd: Exclude dGPUs in eGPU enclosures from DPM quirks")
    Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5145
    Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 0a214d888485b9f35fe03882a92962e6d5697849)
2026-04-17 | drm/amdgpu: drop userq fence driver refs out of fence process() | Prike Liang
    amdgpu_userq_wait_ioctl() takes extra references on waited-on fence drivers and stores them in waitq->fence_drv_xa. When a new userq fence is created, those references are transferred into userq_fence->fence_drv_array so they can be released when the fence completes.
    However, those inherited references are currently only dropped from amdgpu_userq_fence_driver_process(). If a fence never reaches that path, for example because it is already signaled when created, the references have to be released explicitly.
    v2: use a list (list_cut_before) for managing the signal userq driver fences. (Christian)
    Link: https://patchwork.freedesktop.org/patch/718078/?series=164763&rev=2
    v3: Don't cache the userq first unsignaled fence; use the cut-before list head directly. (Christian)
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Prike Liang <Prike.Liang@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu/userq: unpin and unref doorbell and wptr outside mutex | Sunil Khatri
    In amdgpu_userq_destroy, once unmap_helper has been called under the mutex there is no need to keep holding the mutex. This helps avoid a deadlock involving the doorbell and wptr ww mutex, and these BOs can be unpinned and unreferenced safely outside the mutex.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu/userq: use pm_runtime_resume_and_get and fix err handling | Sunil Khatri
    Use pm_runtime_resume_and_get() instead of pm_runtime_get_sync(): on failure it returns the error and drops the reference inside the function itself. The error paths reached through goto statements need to drop the pm reference too.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Acked-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
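The error-handling pattern this commit describes can be illustrated in userspace. This is a minimal sketch under stated assumptions, not the driver code: `struct fake_dev`, `fake_resume_and_get()`, `fake_put()` and `fake_queue_create()` are hypothetical mocks. The only behaviour taken from the kernel API is that `pm_runtime_resume_and_get()` leaves no reference held when it fails, so only exits after a successful get need a put.

```c
#include <errno.h>

/* Hypothetical device: a usage counter plus a switch that makes resume fail. */
struct fake_dev {
    int usage;
    int resume_fails;
};

/*
 * Mock with the semantics the commit relies on: on failure the reference
 * taken internally is dropped again before returning the error.
 */
static int fake_resume_and_get(struct fake_dev *d)
{
    d->usage++;
    if (d->resume_fails) {
        d->usage--;
        return -EIO;
    }
    return 0;
}

static void fake_put(struct fake_dev *d)
{
    d->usage--;
}

/*
 * Hypothetical create path: a failed get needs no cleanup, while every
 * later error exit goes through err_put to drop the reference.
 */
static int fake_queue_create(struct fake_dev *d, int setup_fails)
{
    int r = fake_resume_and_get(d);

    if (r)
        return r;               /* nothing was taken, nothing to undo */

    if (setup_fails) {
        r = -ENOMEM;
        goto err_put;
    }
    return 0;                   /* reference stays held while the queue lives */

err_put:
    fake_put(d);
    return r;
}
```

In this mock, the usage counter only grows when creation succeeds; both failure modes leave it unchanged.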
2026-04-17 | drm/amdgpu/userq: unmap_helper dont return the queue state | Sunil Khatri
    We check the return value of amdgpu_userq_unmap_helper and compare it against queue->state, which is logically wrong; we should just check for failure and do the needful.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu/userq: unmap is to be called before freeing doorbell/wptr bo | Sunil Khatri
    Unmapping the queue after freeing the doorbell and wptr memory is completely wrong. Any operation on the queue needs the doorbell and wptr to be valid, hence fix the ordering.
    Also, since amdgpu_bo_reserve is used in non-interruptible mode, there is no need to check its return value.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu/userq: hold root bo lock in caller of input_va_validate | Sunil Khatri
    Caller should hold the reservation lock for root.bo in func amdgpu_userq_input_va_validate.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu/userq: caller to take reserv lock for vas_list_cleanup | Sunil Khatri
    In function amdgpu_userq_buffer_vas_list_cleanup, remove the reservation lock for vm and caller should make sure it's taken before locking userq_mutex.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu/userq: create_mqd does not need userq_mutex | Sunil Khatri
    Reshuffle the code to run create_mqd outside the mutex. The code here mostly sets up software structures before actually registering the userqueue in the xa and with the driver.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu/userq: dont lock root bo with userq_mutex held | Sunil Khatri
    Do not take the reservation lock for the root bo if userq_mutex is already held in the call flow; this causes a lock issue with ttm_bo_delayed_delete.
    It is better to lock vm->root.bo first and then take userq_mutex, so that threads holding userq_mutex do not get stuck waiting for the reservation lock. In this case it helps in the function amdgpu_userq_buffer_vas_mapped for each queue during restore_all.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu/userq: fix kerneldoc for amdgpu_userq_ensure_ev_fence | Sunil Khatri
    Move the comment for the caller to the definition for amdgpu_userq_ensure_ev_fence in kerneldoc format.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu/userq: clean the VA mapping list for failed queue creation | Sunil Khatri
    If queue creation fails while mapping the important VAs such as queue_va, rptr_va and wptr_va, these mappings need to be cleaned up, because queue destroy will not be called for queues whose creation failed.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Acked-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu/userq: avoid uneccessary locking in amdgpu_userq_create | Sunil Khatri
    Reorganise the code to avoid holding userq_mutex while also trying to grab the exec lock (a ww_mutex) where it is not needed, for the function amdgpu_userq_input_va_validate.
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu: Clear cached EDID pointer after drm_edid_free() | Srinivasan Shanmugam
    The driver stores EDID in amdgpu_connector->edid and uses it as a cache. amdgpu_connector_get_edid() checks this pointer. If it is not NULL, it assumes EDID is already present and does not read it again.
    In some detect paths, the driver frees the EDID using drm_edid_free(), but does not set the pointer to NULL. Because of this, the pointer still looks valid even though the memory is already freed.
    Later, when amdgpu_connector_get_edid() is called, it returns early and does not read a new EDID. This can lead to using a freed pointer.
    Fix this by setting amdgpu_connector->edid = NULL after drm_edid_free(). This makes sure the driver reads a fresh EDID and does not use invalid memory.
    Fixes: 71036457ad85 ("drm/amdgpu/amdgpu_connectors: remove amdgpu_connector_free_edid")
    Reported-by: Dan Carpenter <error27@gmail.com>
    Cc: Joshua Peisach <jpeisach@ubuntu.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: Christian König <christian.koenig@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Joshua Peisach <jpeisach@ubuntu.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
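The cached-pointer rule from this commit can be shown with a small userspace model. This is only a sketch: `struct fake_connector`, `get_edid()` and `drop_edid()` are hypothetical stand-ins for the connector cache, with `malloc()`/`free()` in place of the DRM EDID helpers. It shows why the pointer must be reset after the free: the getter treats any non-NULL pointer as a valid cache entry.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical connector: a cached blob plus a counter of real reads. */
struct fake_connector {
    char *edid;
    int reads;
};

/* Return the cached blob, reading a fresh one only when the cache is empty. */
static char *get_edid(struct fake_connector *c)
{
    if (c->edid)
        return c->edid;         /* cache hit: trusted without re-reading */

    c->edid = malloc(8);
    if (c->edid) {
        memcpy(c->edid, "EDID", 5);
        c->reads++;
    }
    return c->edid;
}

/* Free the blob and clear the pointer so the next get_edid() re-reads. */
static void drop_edid(struct fake_connector *c)
{
    free(c->edid);
    c->edid = NULL;
}
```

Without the `c->edid = NULL` line, the next `get_edid()` would return the dangling pointer instead of reading again, which is the use-after-free the commit message describes.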
2026-04-17 | drm/amdgpu/mes_v12_1: Fix iterator reuse in mes_v12_1_test_ring() | Srinivasan Shanmugam
    This code waits for the MES self-test to complete by repeatedly checking a register or memory value until it becomes valid or a timeout occurs. The fix ensures the timeout counter works correctly by not reusing the same variable inside another loop.
    mes_v12_1_test_ring() uses 'i' as the outer timeout loop counter, but reuses the same variable for the inner XCC scan in cooperative mode. This makes the timeout counter ambiguous and can lead to incorrect timeout handling. It also triggers a Smatch warning about reusing the outer loop iterator.
    Fix this by introducing a separate iterator for the inner XCC loop so that 'i' continues to represent only the timeout wait duration.
    drivers/gpu/drm/amd/amdgpu/mes_v12_1.c:2080 mes_v12_1_test_ring() warn: reusing outside iterator: 'i'
    drivers/gpu/drm/amd/amdgpu/mes_v12_1.c
        2069         atomic64_set((atomic64_t *)wptr_cpu_addr, wptr);
        2070         WDOORBELL64(doorbell_idx, wptr);
        2071
        2072         for (i = 0; i < adev->usec_timeout; i++) {
                          i is counting usec
        2073                 if (queue_type == AMDGPU_RING_TYPE_SDMA) {
        2074                         tmp = le32_to_cpu(*cpu_ptr);
        2075                 } else {
        2076                         if (!adev->mes.enable_coop_mode) {
        2077                                 tmp = RREG32_SOC15(GC, GET_INST(GC, xcc_id),
        2078                                                    regSCRATCH_REG0);
        2079                         } else {
    --> 2080                                 for (i = 0; i < num_xcc; i++) {
                                                  and then re-used to count something else
    Fixes: 44e5195fa3d4 ("drm/amdgpu/mes_v12_1: add mes self test")
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Cc: Jack Xiao <Jack.Xiao@amd.com>
    Cc: Hawking Zhang <Hawking.Zhang@amd.com>
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
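The shape of the fix, one iterator per loop, can be sketched outside the driver. The code below is an illustration under assumptions, not the MES code: `demo_read()` and `wait_all_ready()` are hypothetical, and the "unit" scan stands in for the inner XCC loop. The outer counter `i` only ever counts polling ticks, while `j` walks the units.

```c
#include <stdbool.h>

/* Hypothetical status source: unit 'unit' reports ready (1) from tick unit * 2 on. */
static unsigned int demo_read(int unit, int tick)
{
    return tick >= unit * 2 ? 1u : 0u;
}

/*
 * Poll until every unit reports ready or 'timeout' ticks have elapsed.
 * Returns the tick at which all units were ready, or -1 on timeout.
 * 'i' counts ticks only; the inner scan uses its own iterator 'j', so the
 * timeout accounting is never disturbed by the per-unit loop.
 */
static int wait_all_ready(unsigned int (*read_status)(int unit, int tick),
                          int num_units, int timeout)
{
    int i, j;

    for (i = 0; i < timeout; i++) {
        bool ready = true;

        for (j = 0; j < num_units; j++) {
            if (!read_status(j, i)) {
                ready = false;
                break;
            }
        }
        if (ready)
            return i;
    }
    return -1;
}
```

If the inner loop reused `i`, every pass would overwrite the tick count with a unit index, so the outer loop's exit condition would no longer reflect elapsed time.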
2026-04-17 | drm/amdgpu/sdma7.1: add support for disable_kq | Alex Deucher
    Plumb in support for disabling kernel queues and make it the default. For testing, kernel queues can be re-enabled by setting amdgpu.user_queue=0. Kernel queues are still created for use by the kernel driver for memory management, etc., just not user submissions.
    Reviewed-by: Prike Liang <Prike.Liang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu: fix IP discovery v0 handling | filippor
    Cyan skillfish uses IP discovery v0. This was broken when the IP discovery was refactored for newer versions.
    Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5189
    Fixes: d0c647a6aae2 ("drm/amdgpu/discovery: support new discovery binary header")
    Signed-off-by: filippor <filippo.rossoni@gmail.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu: fix CPER ring header parsing | Xiang Liu
    amdgpu_cper_ring_get_ent_sz() parses CPER headers directly from the circular ring buffer to determine the current entry size. When the ring is full and the write pointer lands near the end of the buffer, the header can wrap across the ring boundary.
    The existing code treats the 4-byte CPER signature as a C string and uses strcmp() on in-ring binary data, then reads record_length through a direct struct pointer cast. Both assumptions are unsafe for wrapped entries and can read past the end of the ring mapping.
    Fix the parser by comparing the signature as raw bytes and by copying the header into a local buffer before reading record_length, handling wraparound explicitly in both cases.
    This avoids out-of-bounds reads in amdgpu_cper_ring_get_ent_sz() when the CPER ring is full or the current entry starts at the tail of the ring.
    Signed-off-by: Xiang Liu <xiang.liu@amd.com>
    Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
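The parsing approach described in this commit (copy the header out of the ring with explicit wraparound, compare the signature as raw bytes) can be sketched as follows. This is a userspace illustration under assumptions: the 8-byte header layout (4-byte signature followed by a 32-bit length in host byte order) is a simplified, hypothetical one, and `ring_copy()` and `entry_size()` are not the driver's functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_SIG  "CPER"     /* compared as 4 raw bytes, never as a C string */
#define HDR_SIZE 8          /* hypothetical: 4-byte signature + 4-byte length */

/*
 * Copy 'len' bytes starting at 'pos' out of a circular buffer of 'size'
 * bytes, continuing at the start of the buffer when the end is reached.
 * Requires len <= size.
 */
static void ring_copy(void *dst, const uint8_t *ring, size_t size,
                      size_t pos, size_t len)
{
    size_t first;

    pos %= size;
    first = size - pos;
    if (first > len)
        first = len;

    memcpy(dst, ring + pos, first);
    memcpy((uint8_t *)dst + first, ring, len - first);
}

/* Return the record length of the entry at 'pos', or 0 if the signature is wrong. */
static uint32_t entry_size(const uint8_t *ring, size_t size, size_t pos)
{
    uint8_t hdr[HDR_SIZE];
    uint32_t len;

    if (size < HDR_SIZE)
        return 0;

    ring_copy(hdr, ring, size, pos, HDR_SIZE);  /* local copy, wrap handled */
    if (memcmp(hdr, HDR_SIG, 4) != 0)
        return 0;

    memcpy(&len, hdr + 4, sizeof(len));         /* no struct cast into the ring */
    return len;
}
```

Because both the signature and the length are read from the local copy, no access goes past the end of the ring even when the header starts a few bytes before the boundary.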
2026-04-17 | drm/amdgpu: fix heap buffer overflow in amdgpu_coredump ring dump | Vitaly Prosyak
    The off variable in the ring content dump loop tracks a byte offset accumulated from ring->ring_size (which is in bytes), but it is used as an index into u32 *rings_dw. C pointer arithmetic on a u32 pointer automatically multiplies the index by sizeof(u32) = 4, so the actual byte address accessed is:
        &rings_dw[off] == (char *)rings_dw + off * 4
    This means off is effectively quadrupled, causing a 4x overshoot.
    Concrete example -- two rings, each ring_size = 8 192 bytes (8 KB):
        total_ring_size = 16 384 bytes
        rings_dw = kzalloc(16 384)    /* 16 KB buffer */
        Ring 0: off = 0
            memcpy(&rings_dw[0], ring0->ring, 8192)
            -> writes bytes 0 .. 8 191                     OK
            off += ring->ring_size -> off = 8 192          (BUG)
        Ring 1: off = 8 192
            memcpy(&rings_dw[8192], ring1->ring, 8192)
            -> actual byte offset = 8 192 * 4 = 32 768
            -> writes bytes 32 768 .. 40 959
            -> but buffer is only 16 384 bytes!            OVERFLOW
    With the fix (off += ring->ring_size / 4):
        Ring 0: off = 0
            memcpy(&rings_dw[0], ring0->ring, 8192)        OK
            off += 8 192 / 4 -> off = 2 048
        Ring 1: off = 2 048
            memcpy(&rings_dw[2048], ring1->ring, 8192)
            -> byte offset = 2 048 * 4 = 8 192
            -> writes bytes 8 192 .. 16 383                OK
    KASAN catches the overflow as a slab-use-after-free when the write lands on a quarantined slab object:
        BUG: KASAN: slab-use-after-free in amdgpu_coredump+0x775/0x13c0 [amdgpu]
        Write of size 8192 at addr ffff8890b2400000 by task kworker/u128:1/329
        Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
        Call Trace:
          __asan_memcpy+0x3c/0x60
          amdgpu_coredump+0x775/0x13c0 [amdgpu]
          amdgpu_job_timedout+0xdb5/0x1420 [amdgpu]
    The corrupted object was a 4 KB drm_exec buffer from a completed amdgpu_cs_ioctl -- the ring dump memcpy overshot into this freed slab region.
    Fix by accumulating off in dword units (ring->ring_size / 4) so the u32* indexing produces the correct byte address. The reader in amdgpu_devcoredump_format() already consumes the stored offset as a dword index (rings_dw[off + j / 4]), so no change is needed there.
    Fixes: eea85914d15b ("drm/amdgpu: save ring content before resetting the device")
    Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: Jesse Zhang <jesse.zhang@amd.com>
    Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
    Acked-by: Christian König <christian.koenig@amd.com>
    Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
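The offset arithmetic in this commit message can be reproduced in userspace. The following is a minimal sketch, not the driver code: `struct fake_ring` and `dump_rings()` are hypothetical stand-ins, and `calloc()` replaces `kzalloc()`. The only thing taken from the commit is the rule that an index into a `u32 *` advances in 4-byte steps, so the running offset has to be kept in dwords.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a ring: its contents plus its size in bytes. */
struct fake_ring {
    const uint32_t *ring;
    size_t ring_size;       /* bytes, as in the commit message */
};

/*
 * Copy every ring back to back into one freshly allocated dword buffer.
 * 'off' indexes a uint32_t array, so it advances by ring_size / 4 (dwords),
 * not by ring_size (bytes).
 * Returns the buffer (caller frees) or NULL if the allocation fails;
 * *total_dw receives the number of dwords written.
 */
static uint32_t *dump_rings(const struct fake_ring *rings, size_t count,
                            size_t *total_dw)
{
    size_t total_bytes = 0, off = 0, i;
    uint32_t *rings_dw;

    for (i = 0; i < count; i++)
        total_bytes += rings[i].ring_size;

    rings_dw = calloc(1, total_bytes);
    if (!rings_dw)
        return NULL;

    for (i = 0; i < count; i++) {
        memcpy(&rings_dw[off], rings[i].ring, rings[i].ring_size);
        off += rings[i].ring_size / sizeof(uint32_t);
    }

    *total_dw = off;
    return rings_dw;
}
```

With two 16-byte rings, the second copy starts at dword index 4 (byte 16) and ends exactly at the end of the 32-byte buffer. Advancing `off` by `ring_size` instead would start it at dword index 16 (byte 64), past the allocation.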
2026-04-17 | drm/amdgpu: correct single device PCIe reset flow for DPC | Ce Sun
    For triggering the dpc event with a single device, we still need to set the in_link_reset flag and the dpc status.
    Signed-off-by: Ce Sun <cesun102@amd.com>
    Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu: fix NULL pointer dereference in amdgpu_devcoredump_format | Vitaly Prosyak
    A race condition in the devcoredump code causes a NULL pointer dereference in amdgpu_devcoredump_format() when multiple GPU resets occur in quick succession.
    The sequence of events:
    1. First reset calls amdgpu_coredump(), creates coredump1, sets adev->coredump = coredump1, and queues the deferred work.
    2. The deferred work begins executing (work_pending() returns false since the work is now running, not just queued).
    3. A second reset calls amdgpu_coredump(). work_pending() returns false because the work is running, so amdgpu_coredump() proceeds: creates coredump2, overwrites adev->coredump = coredump2, and re-queues the deferred work with queue_work().
    4. The first deferred work finishes and unconditionally sets adev->coredump = NULL, destroying the reference to coredump2.
    5. The re-queued deferred work starts and reads adev->coredump = NULL. It then passes this NULL into amdgpu_devcoredump_format() which dereferences coredump->adev (offset 0 in the struct), triggering:
        KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
        RIP: 0010:amdgpu_devcoredump_format+0xa6/0x36b0 [amdgpu]
    This was observed during the amd_deadlock IGT test where multiple subtests trigger rapid ring resets. The dmesg log shows four coredumps created within 120ms (at 102.377s, 104.424s, 104.492s, and 104.497s), with the crash occurring 13ms after the last one.
    Fix this with two changes:
    - Replace work_pending() with work_busy() in amdgpu_coredump() to also reject new coredumps while the deferred work is executing, not just when it is queued. This closes the main race window.
    - Add a defensive NULL check for adev->coredump at the start of amdgpu_devcoredump_deferred_work() to prevent the crash if the race still occurs (work_busy() is advisory, not a full barrier).
    v2: Drop the job->pasid NULL guard -- that fix was independently submitted and merged as commit 4c1f0a162da5 ("drm/amdgpu: add job->pasid in check as amdgpu_job could be NULL") by Sunil Khatri, reviewed by Christian König. Integrate with that patch as suggested by Christian.
    Fixes: 4bbba79a7f1d ("drm/amdgpu: move devcoredump generation to a worker")
    Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
    Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdgpu: add job->pasid in check as amdgpu_job could be NULL | Sunil Khatri
    In the stack below, job->pasid is accessed while job is NULL. Access it only inside the check that job is non-NULL.
    Failure call stack:
    [ 222.653622] BUG: kernel NULL pointer dereference, address: 000000000000014c
    [ 222.653625] #PF: supervisor read access in kernel mode
    [ 222.653628] #PF: error_code(0x0000) - not-present page
    [ 222.653630] PGD 0 P4D 0
    [ 222.653635] Oops: Oops: 0000 [#1] SMP NOPTI
    [ 222.653639] CPU: 1 UID: 0 PID: 12 Comm: kworker/u96:0 Not tainted 6.19.0-amd-staging-drm-next #271 PREEMPT(voluntary)
    [ 222.653644] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F37c 05/12/2022
    [ 222.653646] Workqueue: amdgpu-reset-dev amdgpu_userq_reset_work [amdgpu]
    [ 222.653961] RIP: 0010:amdgpu_coredump+0x8b/0x470 [amdgpu]
    [ 222.654158] Code: 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 31 c9 31 ff 31 d2 31 f6 45 31 c0 45 31 db e9 8c a9 1a e2 88 58 48 44 88 68 49 <41> 8b b7 4c 01 00 00 89 b0 80 00 00 00 4d 85 ff 48 89 45 d0 0f 84
    [ 222.654161] RSP: 0018:ffffce68c0147c00 EFLAGS: 00010282
    [ 222.654165] RAX: ffff8bc337407740 RBX: 0000000000000000 RCX: 0000000000000000
    [ 222.654167] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    [ 222.654170] RBP: ffffce68c0147c48 R08: 0000000000000000 R09: 0000000000000000
    [ 222.654172] R10: ffff8bc337407740 R11: ffffffffc10dda10 R12: ffff8bc2d2e00000
    [ 222.654174] R13: 0000000000000001 R14: ffff8bc2d2e5b368 R15: 0000000000000000
    [ 222.654176] FS: 0000000000000000(0000) GS:ffff8bc64a5fe000(0000) knlGS:0000000000000000
    [ 222.654179] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 222.654182] CR2: 000000000000014c CR3: 0000000135eca000 CR4: 0000000000350ef0
    [ 222.654184] Call Trace:
    [ 222.654187] <TASK>
    [ 222.654190] ? amdgpu_ip_block_resume+0x28/0x70 [amdgpu]
    [ 222.654376] ? srso_return_thunk+0x5/0x5f
    [ 222.654382] amdgpu_device_reinit_after_reset+0x184/0x320 [amdgpu]
    [ 222.654552] amdgpu_do_asic_reset+0x129/0x160 [amdgpu]
    [ 222.654720] amdgpu_device_asic_reset+0x92/0x710 [amdgpu]
    [ 222.654890] amdgpu_device_gpu_recover+0x2ae/0x3d0 [amdgpu]
    [ 222.655060] amdgpu_userq_reset_work+0x76/0xa0 [amdgpu]
    [ 222.655229] process_scheduled_works+0x1f0/0x450
    [ 222.655235] worker_thread+0x27f/0x370
    Fixes: 32ab301b89b3 ("drm/amdgpu: store ib info for devcoredump")
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17 | drm/amdkfd: Clear VRAM on allocation to prevent stale data exposure | Amir Shetaia
    KFD VRAM allocations set AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE but not AMDGPU_GEM_CREATE_VRAM_CLEARED, leaving freshly allocated VRAM with stale data from prior use observable by compute kernels.
    The GEM ioctl path already sets VRAM_CLEARED for all userspace allocations via amdgpu_gem_create_ioctl() and amdgpu_mode_dumb_create(). The KFD path was missing this flag, allowing stale page table remnants to leak into user buffers. This causes crashes in RCCL P2P transport where non-zero data in ptrExchange/head/tail fields corrupts the protocol handshake.
    Signed-off-by: Amir Shetaia <Amir.Shetaia@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
2026-04-17 | drm/amdgpu: Use NBIF offset for register RCC_STRAP0_RCC_DEV0_EPF0_STRAP0 . | Ramalingeswara Reddy, Kanala
    Define and use regRCC_STRAP0_RCC_DEV0_EPF0_STRAP0_nbif_4_10, to get correct rev_id in nbif_v6_3_1_get_rev_id().
    Reviewed-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
    Signed-off-by: Ramalingeswara Reddy, Kanala <Kanala.RamalingeswaraReddy@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
2026-04-17 | drm/amd: Add missing firmware declaration for PSP v15.0.0 | Mario Limonciello
    PSP v15.0.0 needs both TOC and TA firmware. Without the declaration it won't get included in initramfs and leads to following failure:
    ```
    Direct firmware load for amdgpu/psp_15_0_0_ta.bin failed with error -2
    early_init of IP block <psp> failed -19
    Fatal error during GPU init
    ```
    Fixes: 9b24f63d825e7 ("drm/amdgpu: Enable support for PSP 15_0_0")
    Reviewed-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
2026-04-17 | amdgpu/jpeg: fix deepsleep register for jpeg 5_0_0 and 5_0_2 | David (Ming Qiang) Wu
    PCTL0__MMHUB_DEEPSLEEP_IB is 0x69004 on MMHUB 4,1,0 and 0x60804 on MMHUB 4,2,0. 0x62a04 is on MMHUB 1,8,0/1. The DS bits are adjusted to cover more JPEG engines and MMHUB versions.
    Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
2026-04-17 | drm/amdgpu: gate VM CPU HDP flush on reset lock | Chenglei Xie
    During GPU reset, the application could still run CPU page table updates. Each commit called amdgpu_device_flush_hdp(), which on SR-IOV sends work through the KIQ ring. That can advance sync_seq while the GPU is being reset, leaving fence writeback out of sync and causing amdgpu_fence_emit_polling() to time out on later KIQ use.
    Fix: amdgpu_vm_cpu_commit(): Reset will flush HDP anyway, so the HDP flush in amdgpu_vm_cpu_commit() can be skipped when a reset is ongoing. Take reset_domain->sem with down_read_trylock() before amdgpu_device_flush_hdp(). If the reset path holds the write lock, skip the HDP flush so no HDP-related HW access (including KIQ) runs during reset; state is re-established after reset.
    Signed-off-by: Chenglei Xie <Chenglei.Xie@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
2026-04-17 | drm/amdgpu: Use SMUIO 15.0.0 offsets for TSC upper and lower count. | Ramalingeswara Reddy, Kanala
    Define and use regGOLDEN_TSC_COUNT_UPPER_smu_15_0_0 and regGOLDEN_TSC_COUNT_LOWER_smu_15_0_0 for TSC upper and lower count.
    Acked-by: Alex Deucher <alexander.deucher@amd.com>
    Reviewed-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
    Signed-off-by: Ramalingeswara Reddy, Kanala <Kanala.RamalingeswaraReddy@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
2026-04-17 | drm/amdgpu: Remove sys file compute_partition_mem_alloc_mode at module unload | Xiaogang Chen
    Module reload would fail when creating a sys file that was not removed during module unload.
    Fixes: e0e9792ea2d4 ("drm/amdgpu: add an option to allow gpu partition allocate all available memory")
    Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
    Reviewed-by: Philip Yang <philip.yang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-15 | Merge tag 'drm-next-2026-04-15' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds
Pull drm updates from Dave Airlie: "Highlights: - new DRM RAS infrastructure using netlink - amdgpu: enable DC on CIK APUs, and more IP enablement, and more user queue work - xe: purgeable BO support, and new hw enablement - dma-buf : add revocable operations Full summary: mm: - two-pass MMU interval notifiers - add gpu active/reclaim per-node stat counters math: - provide __KERNEL_DIV_ROUND_CLOSEST() in UAPI - implement DIV_ROUND_CLOSEST() with __KERNEL_DIV_ROUND_CLOSEST() rust: - shared tag with driver-core: register macro and io infra - core: rework DMA coherent API - core: add interop::list to interop with C linked lists - core: add more num::Bounded operations - core: enable generic_arg_infer and add EMSGSIZE - workqueue: add ARef<T> support for work and delayed work - add GPU buddy allocator abstraction - add DRM shmem GEM helper abstraction - allow drm:::Device to dispatch work and delayed work items to driver private data - add dma_resv_lock helper and raw accessors core: - introduce DRM RAS infrastructure over netlink - add connector panel_type property - fourcc: add ARM interleaved 64k modifier - colorop: add destroy helper - suballoc: split into alloc and init helpers - mode: provide DRM_ARGB_GET*() macros for reading color components edid: - provide drm_output_color_Format dma-buf: - provide revoke mechanism for shared buffers - rename move_notify to invalidate_mappings - always enable move_notify - protect dma_fence_ops with RCU and improve locking - clean pages with helpers atomic: - allocate drm_private_state via callback - helper: use system_percpu_wq buddy: - make buddy allocator available to gpu level - add kernel-doc for buddy allocator - improve aligned allocation ttm: - fix fence signalling - improve tests and docs - improve handling of gfp_retry_mayfail - use per-node stat counters to track memory allocations - port pool to use list_lru - drop NUMA specific pools - make pool shrinker numa aware - track allocated pages per numa node coreboot: - 
cleanup coreboot framebuffer support sched: - fix race condition in drm_sched_fini pagemap: - enable THP support - pass pagemap_addr by reference gem-shmem: - Track page accessed/dirty status across mmap/vmap gpusvm: - reenable device to device migration - fix unbalanced unclock bridge: - anx7625: Support USB-C plus DT bindings - connector: Fix EDID detection - dw-hdmi-qp: Support Vendor-Specfic and SDP Infoframes; improve others - fsl-ldb: Fix visual artifacts plus related DT property 'enable-termination-resistor' - imx8qxp-pixel-link: Improve bridge reference handling - lt9611: Support Port-B-only input plus DT bindings - tda998x: Support DRM_BRIDGE_ATTACH_NO_CONNECTOR; Clean up - Support TH1520 HDMI plus DT bindings - waveshare-dsi: Fix register and attach; Support 1..4 DSI lanes plus DT bindings - anx7625: Fix USB Type-C handling - cdns-mhdp8546-core: Handle HDCP state in bridge atomic_check - Support Lontium LT8713SX DP MST bridge plus DT bindings - analogix_dp: Use DP helpers for link training panel: - panel-jdi-lt070me05000: Use mipi-dsi multi functions - panel-edp: Support Add AUO B116XAT04.1 (HW: 1A); Support CMN N116BCL-EAK (C2); Support FriendlyELEC plus DT changes - panel-edp: Fix timings for BOE NV140WUM-N64 - ilitek-ili9882t: Allow GPIO calls to sleep - jadard: Support TAIGUAN XTI05101-01A - lxd: Support LXD M9189A plus DT bindings - mantix: Fix pixel clock; Clean up - motorola: Support Motorola Atrix 4G and Droid X2 plus DT bindings - novatek: Support Novatek/Tianma NT37700F plus DT bindings - simple: Support EDT ET057023UDBA plus DT bindings; Support Powertip PH800480T032-ZHC19 plus DT bindings; Support Waveshare 13.3" - novatek-nt36672a: Use mipi_dsi_*_multi() functions - panel-edp: Support BOE NV153WUM-N42, CMN N153JCA-ELK, CSW MNF307QS3-2 - support Himax HX83121A plus DT bindings - support JuTouch JT070TM041 plus DT bindings - support Samsung S6E8FC0 plus DT bindings - himax-hx83102c: support Samsung S6E8FC0 plus DT bindings; support backlight - 
ili9806e: support Rocktech RK050HR345-CT106A plus DT bindings - simple: support Tianma TM050RDH03 plus DT bindings amdgpu: - enable DC by default on CIK APUs - userq fence ioctl param size fixes - set panel_type to OLED for eDP - refactor DC i2c code - FAMS2 update - rework ttm handling to allow multiple engines - DC DCE 6.x cleanup - DC support for NUTMEG/TRAVIS DP bridge - DCN 4.2 support - GC12 idle power fix for compute - use struct drm_edid in non-DC code - enable NV12/P010 support on primary planes - support newer IP discovery tables - VCN/JPEG 5.0.2 support - GC/MES 12.1 updates - USERQ fixes - add DC idle state manager - eDP DSC seamless boot amdkfd: - GC 12.1 updates - non 4K page fixes xe: - basic Xe3p_LPG and NVL-P enabling patches - allow VM_BIND decompress support - add purgeable buffer object support - add xe_vm_get_property_ioctl - restrict multi-lrc to VCS/VECS engines - allow disabling VM overcommit in fault mode - dGPU memory optimizations - Workaround cleanups and simplification - Allow VFs VRAM quote changes using sysfs - convert GT stats to per-cpu counters - pagefault refactors - enable multi-queue on xe3p_xpc - disable DCC on PTL - make MMIO communication more robust - disable D3Cold for BMG on specific platforms - vfio: improve FLR sync for Xe VFIO i915/display: - C10/C20/LT PHY PLL divider verification - use trans push mechanism to generate PSR frame change on LNL+ - refactor DP DSC slice config - VGA decode refactoring - refactor DPT, gen2-4 overlay, masked field register macro helpers - refactor stolen memory allocation decisions - prepare for UHBR DP tunnels - refactor LT PHY PLL to use DPLL framework - implement register polling/waiting in display code - add shared stepping header between i915 and display i915: - fix potential overflow of shmem scatterlist length nouveau: - provide Z cull info to userspace - initial GA100 support - shutdown on PCI device shutdown nova-core: - harden GSP command queue - add support for large RPCs - 
simplify GSP sequencer and message handling - refactor falcon firmware handling - convert to new register macro - conver to new DMA coherent API - use checked arithmetic - add debugfs support for gsp-rm log buffers - fix aux device registration for multi-GPU msm: - CI: - Uprev mesa - Restore CI jobs for Qualcomm APQ8016 and APQ8096 devices - Core: - Switched to of_get_available_child_by_name() - DPU: - Fixes for DSC panels - Fixed brownout because of the frequency / OPP mismatch - Quad pipe preparation (not enabled yet) - Switched to virtual planes by default - Dropped VBIF_NRT support - Added support for Eliza platform - Reworked alpha handling - Switched to correct CWB definitions on Eliza - Dropped dummy INTF_0 on MSM8953 - Corrected INTFs related to DP-MST - DP: - Removed debug prints looking into PHY internals - DSI: - Fixes for DSC panels - RGB101010 support - Support for SC8280XP - Moved PHY bindings from display/ to phy/ - GPU: - Preemption support for x2-85 and a840 - IFPC support for a840 - SKU detection support for x2-85 and a840 - Expose AQE support (VK ray-pipeline) - Avoid locking in VM_BIND fence signaling path - Fix to avoid reclaim in GPU snapshot path - Disallow foreign mapping of _NO_SHARE BOs - HDMI: - Fixed infoframes programming - MDP5: - Dropped support for MSM8974v1 - Dropped now unused code for MSM8974 v1 and SDM660 / MSM8998 panthor: - add tracepoints for power and IRQs - fix fence handling - extend timestamp query with flags - support various sources for timestamp queries tyr: - fix names and model/versions rockchip: - vop2: use drm logging function - rk3576 displayport support - support CRTC background color atmel-hlcdc: - support sana5d65 LCD controller tilcdc: - use DT bindings schema - use managed DRM interfaces - support DRM_BRIDGE_ATTACH_NO_CONNECTOR verisilicon: - support DC8200 + DT bindings virtgpu: - support PRIME import with 3D enabled komeda: - fix integer overflow in AFBC checks mcde: - improve bridge handling gma500: - use 
drm client buffer for fbdev framebuffer amdxdna: - add sensors ioctls - provide NPU power estimate - support column utilization sensor - allow forcing DMA through IOMMU IOVA - support per-BO mem usage queries - refactor GEM implementation ivpu: - update boot API to v3.29.4 - limit per-user number of doorbells/contexts - perform engine reset on TDR error loongson: - replace custom code with drm_gem_ttm_dumb_map_offset() imx: - support planes behind the primary plane - fix bus-format selection vkms: - support CRTC background color v3d: - improve handling of struct v3d_stats komeda: - support Arm China Linlon D6 plus DT bindings imagination: - improve power-off sequence - support context-reset notification from firmware mediatek: - mtk_dsi: enable hs clock during pre-enable - Remove all conflicting aperture devices during probe - Add support for mt8167 display blocks" * tag 'drm-next-2026-04-15' of https://gitlab.freedesktop.org/drm/kernel: (1735 commits) drm/ttm/tests: Remove checks from ttm_pool_free_no_dma_alloc drm/ttm/tests: fix lru_count ASSERT drm/vram: remove DRM_VRAM_MM_FILE_OPERATIONS from docs drm/fb-helper: Fix a locking bug in an error path dma-fence: correct kernel-doc function parameter @flags ttm/pool: track allocated_pages per numa node. ttm/pool: make pool shrinker NUMA aware (v2) ttm/pool: drop numa specific pools ttm/pool: port to list_lru. (v2) drm/ttm: use gpu mm stats to track gpu memory allocations. 
(v4) mm: add gpu active/reclaim per-node stat counters (v2) gpu: nova-core: fix missing colon in SEC2 boot debug message gpu: nova-core: vbios: use from_le_bytes() for PCI ROM header parsing gpu: nova-core: bitfield: fix broken Default implementation gpu: nova-core: falcon: pad firmware DMA object size to required block alignment gpu: nova-core: gsp: fix undefined behavior in command queue code drm/shmem_helper: Make sure PMD entries get the writeable upgrade accel/ivpu: Trigger recovery on TDR with OS scheduling drm/msm: Use of_get_available_child_by_name() dt-bindings: display/msm: move DSI PHY bindings to phy/ subdir ...
2026-04-13Merge tag 'vfs-7.1-rc1.kino' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs i_ino updates from Christian Brauner: "For historical reasons, the inode->i_ino field is an unsigned long, which means that it's 32 bits on 32 bit architectures. This has caused a number of filesystems to implement hacks to hash a 64-bit identifier into a 32-bit field, and deprives us of a universal identifier field for an inode. This changes the inode->i_ino field from an unsigned long to a u64. This shouldn't make any material difference on 64-bit hosts, but 32-bit hosts will see struct inode grow by at least 4 bytes. This could have effects on slabcache sizes and field alignment. The bulk of the changes are to format strings and tracepoints, since the kernel itself doesn't care that much about the i_ino field. The first patch changes some vfs function arguments, so check that one out carefully. With this change, we may be able to shrink some inode structures. For instance, struct nfs_inode has a fileid field that holds the 64-bit inode number. With this set of changes, that field could be eliminated. 
I'd rather leave that sort of cleanups for later just to keep this simple" * tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: nilfs2: fix 64-bit division operations in nilfs_bmap_find_target_in_group() EVM: add comment describing why ino field is still unsigned long vfs: remove externs from fs.h on functions modified by i_ino widening treewide: fix missed i_ino format specifier conversions ext4: fix signed format specifier in ext4_load_inode trace event treewide: change inode->i_ino from unsigned long to u64 nilfs2: widen trace event i_ino fields to u64 f2fs: widen trace event i_ino fields to u64 ext4: widen trace event i_ino fields to u64 zonefs: widen trace event i_ino fields to u64 hugetlbfs: widen trace event i_ino fields to u64 ext2: widen trace event i_ino fields to u64 cachefiles: widen trace event i_ino fields to u64 vfs: widen trace event i_ino fields to u64 net: change sock.sk_ino and sock_i_ino() to u64 audit: widen ino fields to u64 vfs: widen inode hash/lookup functions to u64
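As a rough illustration of the problem described above (not code from this series), the userland C sketch below shows why folding a 64-bit identifier into a 32-bit field loses uniqueness, and how formatting changes once the field is 64 bits wide. The names `old_ino_t`, `new_ino_t`, `fold_ino` and `format_ino` are hypothetical, the XOR fold is only one possible hash, and `PRIu64` stands in for the `%llu` specifier that kernel code would use.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for i_ino as it was on 32-bit hosts. */
typedef uint32_t old_ino_t;
/* Hypothetical stand-in for the widened, always-64-bit i_ino. */
typedef uint64_t new_ino_t;

/*
 * One possible way a filesystem could squeeze a 64-bit file ID into a
 * 32-bit inode number: XOR the two halves. Distinct IDs can collide.
 */
static old_ino_t fold_ino(uint64_t fileid)
{
	return (old_ino_t)(fileid ^ (fileid >> 32));
}

/*
 * Format a widened inode number. A 64-bit value needs a 64-bit
 * specifier; a specifier for unsigned long would be wrong on 32-bit
 * hosts. Returns the number of characters that were (or would be) written.
 */
static int format_ino(char *buf, size_t len, new_ino_t ino)
{
	return snprintf(buf, len, "ino=%" PRIu64, ino);
}
```

For example, `fold_ino(0x0000000100000002)` and `fold_ino(0x0000000200000001)` produce the same 32-bit value even though the 64-bit IDs differ, which is the kind of ambiguity a u64 field avoids.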
2026-04-03drm/amdgpu: dump job ibs in the devcoredumpPierre-Eric Pelloux-Prayer
Now that we have a worker thread, we can try to access the IBs of the job. The process is:
* get the VM from the PASID
* get the BO from its VA and the VM
* map the BO for CPU access
* copy everything, then add it to the dump

Each step can fail so we have to be cautious. These operations can be slow so when amdgpu_devcoredump_format is called only to determine the size of the buffer we skip all of them and assume they will succeed.
---
v3: use kvfree
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
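The "size first, then fill" behaviour described above can be sketched in userland C. This is only a model under stated assumptions: `struct ib_info`, `emit`, `dump_format` and `dump_build` are hypothetical names, a NULL `data` pointer stands in for any of the lookup or mapping steps failing, and the real driver's structures and printing helpers differ. The sizing pass skips the slow work and reserves space as if every step succeeds, so the size it reports is an upper bound.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record of one IB; not the driver's real structure. */
struct ib_info {
	unsigned long long gpu_addr;
	unsigned int ndw;
	const uint32_t *data; /* NULL models a failed VM/BO lookup or map */
};

#define DW_LINE_LEN 11 /* length of one "0x%08x\n" line */

/* Append formatted text at offset off; with buf == NULL only count. */
static size_t emit(char *buf, size_t size, size_t off, const char *fmt, ...)
{
	va_list ap;
	int w;

	va_start(ap, fmt);
	if (buf && off < size)
		w = vsnprintf(buf + off, size - off, fmt, ap);
	else
		w = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	return w > 0 ? off + (size_t)w : off;
}

/* Format the dump; when buf is NULL, return only the space needed. */
static size_t dump_format(char *buf, size_t size,
			  const struct ib_info *ibs, size_t n)
{
	size_t off = 0;

	for (size_t i = 0; i < n; i++) {
		off = emit(buf, size, off, "IB %zu: addr=0x%llx ndw=%u\n",
			   i, ibs[i].gpu_addr, ibs[i].ndw);
		if (!buf) {
			/* Sizing pass: skip slow work, assume it succeeds. */
			off += (size_t)ibs[i].ndw * DW_LINE_LEN;
			continue;
		}
		if (!ibs[i].data)
			continue; /* contents unavailable: leave them out */
		for (unsigned int j = 0; j < ibs[i].ndw; j++)
			off = emit(buf, size, off, "0x%08x\n",
				   (unsigned int)ibs[i].data[j]);
	}
	return off;
}

/* Two passes: measure, allocate, then fill. Caller frees the result. */
static char *dump_build(const struct ib_info *ibs, size_t n, size_t *len_out)
{
	size_t cap = dump_format(NULL, 0, ibs, n);
	char *buf = malloc(cap + 1);

	if (!buf)
		return NULL;
	buf[0] = '\0';
	*len_out = dump_format(buf, cap + 1, ibs, n);
	return buf;
}
```

Because the fill pass may emit less than the sizing pass reserved, a caller should rely on the length returned by the second pass rather than on the reserved capacity.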
2026-04-03drm/amdgpu: store ib info for devcoredumpPierre-Eric Pelloux-Prayer
Store the basic state of IBs so we can read it back in the amdgpu_devcoredump_format function. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu: extract amdgpu_vm_lock_by_pasid from amdgpu_vm_handle_faultPierre-Eric Pelloux-Prayer
This is tricky to implement right and we're going to need it from the devcoredump. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu: Use amdgpu by default for CIK APUs tooTimur Kristóf
CIK APUs are: Kaveri, Kabini and Mullins from 2013~2015, which all have a second generation GCN based integrated GPU.

The amdgpu driver has been working well on CIK APUs for years. Features which were previously missing have been added recently, specifically DC support for analog connectors and DP bridge encoders. Now amdgpu is at feature parity with the old radeon driver on CIK APUs.

Enabling the amdgpu driver by default for CIK APUs has the following benefits:
- More stable OpenGL support through RadeonSI
- Vulkan support through RADV
- Improved performance
- Better display features through DC

Users who want to keep using the old driver can do so using:
amdgpu.cik_support=0 radeon.cik_support=1

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu: save ring content before resetting the devicePierre-Eric Pelloux-Prayer
Otherwise the content might not be relevant. When a coredump is generated the rings with outstanding fences are saved and then printed to the final devcoredump from the worker thread. Since this requires memory allocation, the ring capture might be missing from the generated devcoredump. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu: make userq fence_drv drop explicit in queue destroyPrike Liang
amdgpu_userq_fence_driver_free() is now responsible only for releasing per-queue ancillary state (last_fence, fence_drv_xa) and no longer touches the ownership reference, making each function's contract clear. v2: Get the userq fence driver from amdgpu_userq_fence_driver_alloc() directly and drop the userq fence driver reference after removing the userq_doorbell_xa entry. (Christian) Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu: rework userq fence driver alloc/destroyPrike Liang
The correct fix is to tie the global xa entry lifetime to the queue lifetime: insert in amdgpu_userq_create() and erase in amdgpu_userq_cleanup(), both at the well-defined doorbell_index key. This makes the operation O(1) and resolves the fence driver UAF problem by binding the userq fence driver to each queue. v2: clean up the local variable initialization. (Christian) Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu/userq: use dma_fence_wait_timeout without test for signalledSunil Khatri
In amdgpu_userq_wait_for_last_fence(), use dma_fence_wait to wait indefinitely. There is also no need to print an error, as we won't be timing out anymore. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu/userq: call dma_resv_wait_timeout without test for signalledSunil Khatri
In amdgpu_userq_gem_va_unmap_validate(), call dma_resv_wait_timeout directly. Since we are waiting forever, no return value is expected and hence no handling is needed. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu/userq: add the return code too in error conditionSunil Khatri
In function amdgpu_userq_restore:
a. amdgpu_userq_vm_validate: add the return code in the error condition.
b. amdgpu_userq_restore_all: it already prints the error log, so just update the error log in the function and remove it from the caller.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu/userq: fence wait for max time in amdgpu_userq_wait_for_signalSunil Khatri
Wait indefinitely for fences in amdgpu_userq_wait_for_signal(), using dma_fence_wait(f, false) to do so. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu: replace PASID IDR with XArrayMikhail Gavrilov
Replace the PASID IDR + spinlock with XArray as noted in the TODO left by commit ea56aa262570 ("drm/amdgpu: fix the idr allocation flags").

The IDR conversion still has an IRQ safety issue: amdgpu_pasid_free() can be called from hardirq context via the fence signal path, but amdgpu_pasid_idr_lock is taken with plain spin_lock() in process context, creating a potential deadlock:

    CPU0
    ----
    spin_lock(&amdgpu_pasid_idr_lock)   // process context, IRQs on
    <Interrupt>
    spin_lock(&amdgpu_pasid_idr_lock)   // deadlock

The hardirq call chain is:

    sdma_v6_0_process_trap_irq
    -> amdgpu_fence_process
    -> dma_fence_signal
    -> drm_sched_job_done
    -> dma_fence_signal
    -> amdgpu_pasid_free_cb
    -> amdgpu_pasid_free

Use XArray with XA_FLAGS_LOCK_IRQ (all xa operations use IRQ-safe locking internally) and XA_FLAGS_ALLOC1 (zero is not a valid PASID). Both xa_alloc_cyclic() and xa_erase() then handle locking consistently, fixing the IRQ safety issue and removing the need for an explicit spinlock.

v8: squash in irq safe fix

Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Fixes: ea56aa262570 ("drm/amdgpu: fix the idr allocation flags")
Fixes: 8f1de51f49be ("drm/amdgpu: prevent immediate PASID reuse case")
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
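The allocation behaviour this commit relies on (IDs start at 1, and a freed ID is not handed out again immediately) can be modelled in plain userland C. This is a toy model, not the kernel XArray API and not the driver's code: `struct id_pool`, `id_pool_alloc_cyclic`, `id_pool_free` and the tiny `MODEL_MAX_ID` limit are all hypothetical, locking is left out entirely, and the return convention (the ID, or 0 when full) is a simplification.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_ID 7u /* hypothetical small limit for the demo */

/* Toy ID pool: slot 0 is never handed out, mirroring "zero is invalid". */
struct id_pool {
	bool used[MODEL_MAX_ID + 1];
	uint32_t next; /* where the next search starts */
};

static void id_pool_init(struct id_pool *p)
{
	for (uint32_t i = 0; i <= MODEL_MAX_ID; i++)
		p->used[i] = false;
	p->next = 1;
}

/*
 * Cyclic allocation: search upward from the slot after the most recent
 * allocation and wrap around to 1. Returns an ID in [1, MODEL_MAX_ID],
 * or 0 if every ID is in use.
 */
static uint32_t id_pool_alloc_cyclic(struct id_pool *p)
{
	for (uint32_t n = 0; n < MODEL_MAX_ID; n++) {
		uint32_t id = (p->next - 1 + n) % MODEL_MAX_ID + 1;

		if (!p->used[id]) {
			p->used[id] = true;
			p->next = id % MODEL_MAX_ID + 1;
			return id;
		}
	}
	return 0;
}

/* Release an ID; out-of-range values are ignored. */
static void id_pool_free(struct id_pool *p, uint32_t id)
{
	if (id >= 1 && id <= MODEL_MAX_ID)
		p->used[id] = false;
}
```

In this model, allocating three IDs, freeing the second and allocating again yields 4 rather than 2; the freed ID only comes back once the search wraps around.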
2026-04-03drm/amdgpu/userq: dont need check for return values in amdgpu_userq_evictSunil Khatri
amdgpu_userq_evict() does not need to check return values, as we don't use them, and does not need to log errors, as the called functions already log them. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu: replace use of system_wq with system_dfl_wqMarco Crivellari
This patch continues the effort to refactor workqueue APIs, which began with the changes introducing new workqueues and a new alloc_workqueue flag:

  commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
  commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

The point of the refactoring is to eventually change the default behavior of workqueues to unbound, so that their workload placement is optimized by the scheduler.

Before that can happen, and after a careful review and conversion of each individual case, workqueue users must be converted to the better-named new workqueues with no intended behaviour changes:

  system_wq -> system_percpu_wq
  system_unbound_wq -> system_dfl_wq

This way the old obsolete workqueues (system_wq, system_unbound_wq) can be removed in the future.

Suggested-by: Tejun Heo <tj@kernel.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu: replace use of system_unbound_wq with system_dfl_wqMarco Crivellari
This patch continues the effort to refactor workqueue APIs, which began with the changes introducing new workqueues and a new alloc_workqueue flag:

  commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
  commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

The point of the refactoring is to eventually change the default behavior of workqueues to unbound, so that their workload placement is optimized by the scheduler.

Before that can happen, and after a careful review and conversion of each individual case, workqueue users must be converted to the better-named new workqueues with no intended behaviour changes:

  system_wq -> system_percpu_wq
  system_unbound_wq -> system_dfl_wq

This way the old obsolete workqueues (system_wq, system_unbound_wq) can be removed in the future.

Suggested-by: Tejun Heo <tj@kernel.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amdgpu: add CONFIG_GCOV_PROFILE_AMDGPU Kconfig optionVitaly Prosyak
Add a Kconfig option to enable GCOV code coverage profiling for the amdgpu driver, following the established upstream pattern used by CONFIG_GCOV_PROFILE_FTRACE (kernel/trace), CONFIG_GCOV_PROFILE_RDS (net/rds), and CONFIG_GCOV_PROFILE_URING (io_uring). This allows CI systems to enable amdgpu code coverage entirely via .config (e.g., scripts/config --enable GCOV_PROFILE_AMDGPU) without manually editing the amdgpu Makefile. The option depends on both DRM_AMDGPU and GCOV_KERNEL, defaults to n, and is therefore never enabled in production or distro builds. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>