<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-next.git/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c, branch master</title>
<subtitle>Linux kernel latest source</subtitle>
<id>http://mirrors.hust.edu.cn/git/linux-next.git/atom?h=master</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/linux-next.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/'/>
<updated>2026-07-01T15:56:47+00:00</updated>
<entry>
<title>drm/amdgpu: Do not fiddle with the idle workers too much</title>
<updated>2026-07-01T15:56:47+00:00</updated>
<author>
<name>Tvrtko Ursulin</name>
<email>tvrtko.ursulin@igalia.com</email>
</author>
<published>2026-06-26T08:55:58+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/commit/?id=50be7c9b5d5ea55fd40bb411cf324cec99ec7417'/>
<id>urn:sha1:50be7c9b5d5ea55fd40bb411cf324cec99ec7417</id>
<content type='text'>
Idle workers only need to be canceled or pushed back if we are potentially
idle. Make both operations conditional on the pre-increment and
post-decrement status of the in-flight job counter.

Reviewed-by: Timur Kristóf &lt;timur.kristof@gmail.com&gt;
Signed-off-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Timur Kristóf &lt;timur.kristof@gmail.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix false error return to non-KCQ</title>
<updated>2026-07-01T15:41:10+00:00</updated>
<author>
<name>Amber Lin</name>
<email>amber.lin@amd.com</email>
</author>
<published>2026-06-26T03:09:10+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/commit/?id=9d5f1c0db1d37db24bb9556dd1e433eb30fbd3b6'/>
<id>urn:sha1:9d5f1c0db1d37db24bb9556dd1e433eb30fbd3b6</id>
<content type='text'>
amdgpu_gfx_reset_mes_compute is used to coordinate suspend_all, reset,
and resume_all between KCQ and compute user queues. When a hung queue
comes from the compute user queues and the reset is successful, the KCQ
failure after reset should be sent to KCQ only and not the compute user
queues. Compute user queues can operate after a successful reset without
a mode reset.

Fixes: a4e4d945cba8 ("drm/amdgpu/gfx: defer per-queue helper_end until after MES resume")
Signed-off-by: Amber Lin &lt;amber.lin@amd.com&gt;
Acked-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>Revert "drm/amdgpu: defer KCQ remap until after MES resume in reset flow"</title>
<updated>2026-07-01T15:30:04+00:00</updated>
<author>
<name>Jesse Zhang</name>
<email>Jesse.Zhang@amd.com</email>
</author>
<published>2026-06-25T05:28:54+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/commit/?id=808481e5fb8fff13fc8890c259b0d16cf363328d'/>
<id>urn:sha1:808481e5fb8fff13fc8890c259b0d16cf363328d</id>
<content type='text'>
This reverts commit 36b6c723d82c07dbbeae95d5883d4ecf0a643727.

It introduced a regression on gfx11: the kfd negative test failed.

Signed-off-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Reviewed-by: Amber Lin &lt;Amber.Lin@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: defer KCQ remap until after MES resume in reset flow</title>
<updated>2026-07-01T15:15:47+00:00</updated>
<author>
<name>Jesse Zhang</name>
<email>Jesse.Zhang@amd.com</email>
</author>
<published>2026-06-20T15:06:35+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/commit/?id=36b6c723d82c07dbbeae95d5883d4ecf0a643727'/>
<id>urn:sha1:36b6c723d82c07dbbeae95d5883d4ecf0a643727</id>
<content type='text'>
Split amdgpu_gfx_mes_reset_queue_start() into reset+unmap now and queue
reinit later, and do the remap only after amdgpu_mes_resume(). Avoids
re-adding legacy queues while MES gangs are still suspended.

Suggested-by: Shaoyun Liu &lt;shaoyun.liu@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix mes remove_hw_queue lock</title>
<updated>2026-07-01T15:07:40+00:00</updated>
<author>
<name>Amber Lin</name>
<email>Amber.Lin@amd.com</email>
</author>
<published>2026-06-17T17:15:55+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/commit/?id=d077a0d57c6d151866c4914e7890b6117d255c61'/>
<id>urn:sha1:d077a0d57c6d151866c4914e7890b6117d255c61</id>
<content type='text'>
The down_read/up_read of the adev-&gt;reset_domain semaphore should be
placed around the queue removal.

v2: remove the empty function recover_bad_queue_mes to avoid a compile
error on RHEL

Fixes: f401a2633e02 ("drm/amdgpu: Remove faulty queue before resume")
Signed-off-by: Amber Lin &lt;Amber.Lin@amd.com&gt;
Reviewed-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/gfx: fix cleaner shader IB buffer overflow</title>
<updated>2026-06-17T20:13:02+00:00</updated>
<author>
<name>Asad Kamal</name>
<email>asad.kamal@amd.com</email>
</author>
<published>2026-06-05T15:44:08+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/commit/?id=bf21af331ebf72d0935fd70c73192414a422c03a'/>
<id>urn:sha1:bf21af331ebf72d0935fd70c73192414a422c03a</id>
<content type='text'>
The cleaner shader sysfs path allocates a 16-dword (64 byte) IB but
incorrectly fills (align_mask + 1) dwords. On GFX rings align_mask is
0xff, so the loop wrote 256 dwords into a 64-byte buffer, causing a
kernel page fault.

The IB only needs to be a minimal NOP shell to schedule the job; the
cleaner shader itself is emitted on the ring via emit_cleaner_shader().
Fill 16 dwords to match the allocation.

v2: Use ib_size_dw variable (Lijo)

Fixes: d361ad5d2fc0 ("drm/amdgpu: Add sysfs interface for running cleaner shader")
Suggested-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Asad Kamal &lt;asad.kamal@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/gfx: defer per-queue helper_end until after MES resume</title>
<updated>2026-06-17T20:11:54+00:00</updated>
<author>
<name>Jesse Zhang</name>
<email>Jesse.Zhang@amd.com</email>
</author>
<published>2026-06-05T08:28:47+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/commit/?id=a4e4d945cba8a2fdbe2d964d37eba1f5b5c51365'/>
<id>urn:sha1:a4e4d945cba8a2fdbe2d964d37eba1f5b5c51365</id>
<content type='text'>
amdgpu_gfx_reset_mes_compute() runs amdgpu_mes_suspend(adev, 0) to
quiesce all gangs, resets the offending queue(s), then resumes. The
existing amdgpu_gfx_mes_reset_queue() called amdgpu_ring_reset_helper_end()
right after unmap/restore/map of the reset queue, which re-emits backed-up
commands and rings the doorbell. That doorbell hits a still-suspended CP:
on the subsequent resume the queue partially wedges -- the first new IB
after the reset may execute but later submissions stall, which surfaces
as repeated timeouts on the same ring under concurrent workloads.

Split out amdgpu_gfx_mes_reset_queue_start() (backup + MES reset +
unmap/restore/map only) and defer helper_end. amdgpu_gfx_reset_mes_compute()
collects the (ring, fence) pair for every queue it resets and runs
helper_end on each after amdgpu_mes_resume(), so the re-emit doorbells
land on a running CP. amdgpu_gfx_reset_mes_kcq() now reports the matched
ring/fence back to the caller for the same reason.

Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Remove faulty queue before resume</title>
<updated>2026-06-17T19:51:36+00:00</updated>
<author>
<name>Amber Lin</name>
<email>Amber.Lin@amd.com</email>
</author>
<published>2026-05-29T19:36:52+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/commit/?id=f401a2633e0243a3ea2f42a0b2806bf62057cb3d'/>
<id>urn:sha1:f401a2633e0243a3ea2f42a0b2806bf62057cb3d</id>
<content type='text'>
When the driver already knows of a bad queue, but MES suspend_all succeeds
and MES hung queue detection doesn't detect it, remove this queue before
resume_all.

Signed-off-by: Amber Lin &lt;Amber.Lin@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/gfx: add a common helper to handle MES compute resets</title>
<updated>2026-06-17T19:51:35+00:00</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2026-05-07T16:03:47+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/commit/?id=e49044061b37cc4be99bfd17f6ccdd3509300469'/>
<id>urn:sha1:e49044061b37cc4be99bfd17f6ccdd3509300469</id>
<content type='text'>
Add helpers to handle MES compute queue resets when multiple queues
are affected.  Can be used by both KGD and KFD.

v2: squash in updates
v3: squash in userq updates

Co-developed-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Co-developed-by: Amber Lin &lt;Amber.Lin@amd.com&gt;
Signed-off-by: Amber Lin &lt;Amber.Lin@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Reviewed-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Use a common KGQ and KCQ reset helper for gfx11/12</title>
<updated>2026-06-17T19:51:35+00:00</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2026-05-19T20:32:59+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/linux-next.git/commit/?id=b5ded0313519e7f84e4f20ba956843a212ab821b'/>
<id>urn:sha1:b5ded0313519e7f84e4f20ba956843a212ab821b</id>
<content type='text'>
They are all the same so use a common implementation.

Reviewed-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
