| Age | Commit message | Author |
|
Add support for workload submission trace points.
Co-developed-by: Donald Robson <donald.robson@imgtec.com>
Signed-off-by: Donald Robson <donald.robson@imgtec.com>
Signed-off-by: Alexandru Dadu <alexandru.dadu@imgtec.com>
Reviewed-by: Matt Coster <matt.coster@imgtec.com>
Link: https://patch.msgid.link/20260513-b4-pvr-trace-points-v1-1-81222d1a4c99@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
|
|
Verify the job’s fence in the timeout handler; if the firmware has since
signaled completion, then report NO HANG.
Signed-off-by: Brajesh Gupta <brajesh.gupta@imgtec.com>
Reviewed-by: Matt Coster <matt.coster@imgtec.com>
Link: https://patch.msgid.link/20260519-b4-context_reset-v2-2-931018a7131d@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
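
The timeout-handler check described in this entry can be illustrated with a minimal sketch. All names below (sketch_job, sketch_timedout_job, the SKETCH_STAT_* values) are hypothetical stand-ins and not the driver's actual types; the sketch only assumes that a job records whether the firmware has signaled its fence.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the scheduler status values. */
enum sketch_sched_stat {
	SKETCH_STAT_RESET,	/* the job hung and recovery was performed */
	SKETCH_STAT_NO_HANG,	/* the job completed, nothing to recover */
};

/* Hypothetical job: only tracks fence state and whether a reset ran. */
struct sketch_job {
	bool fence_signaled;
	bool reset_done;
};

/* Re-check the fence before treating a timeout as a real hang. */
static enum sketch_sched_stat sketch_timedout_job(struct sketch_job *job)
{
	if (job->fence_signaled)
		return SKETCH_STAT_NO_HANG;

	job->reset_done = true;	/* placeholder for the real recovery path */
	return SKETCH_STAT_RESET;
}
```

The point of the early return is that a job which finished just after the timer fired does not trigger recovery.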
|
|
Remove member no longer used by the scheduler core.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Frank Binns <frank.binns@imgtec.com>
Cc: Matt Coster <matt.coster@imgtec.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Matt Coster <matt.coster@imgtec.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patch.msgid.link/20260417103744.76020-21-tvrtko.ursulin@igalia.com
|
|
Mixed list of clarifications and typo fixes.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Brajesh Gupta <brajesh.gupta@imgtec.com>
Link: https://patch.msgid.link/20260330-job-submission-fixes-cleanup-v1-8-7de8c09cef8c@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
|
|
By the time prepare_job() is called on a paired fragment job, the paired
geometry job might already be finished and its PM reference dropped.
Check the fragment job's PM reference instead, which is slightly more
likely to still be set. This is a very minor optimization.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Brajesh Gupta <brajesh.gupta@imgtec.com>
Link: https://patch.msgid.link/20260330-job-submission-fixes-cleanup-v1-7-7de8c09cef8c@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
|
|
This should make the code slightly clearer.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Brajesh Gupta <brajesh.gupta@imgtec.com>
Link: https://patch.msgid.link/20260330-job-submission-fixes-cleanup-v1-6-7de8c09cef8c@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
|
|
This function is only used by the synchronization code to figure out if
a fence belongs to this driver.
Rename it to pvr_queue_fence_is_native() and update its documentation to
reflect its current purpose.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Brajesh Gupta <brajesh.gupta@imgtec.com>
Link: https://patch.msgid.link/20260330-job-submission-fixes-cleanup-v1-4-7de8c09cef8c@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
|
|
While submitting a paired fragment job, there is no need to manually
look for, and skip, the paired job fence, as the existing logic to
resolve dependencies to pvr_queue_fence objects will have failed to
resolve it already and continued with the next one.
Point this out where the fence is actually accessed and drop the related
check.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Brajesh Gupta <brajesh.gupta@imgtec.com>
Link: https://patch.msgid.link/20260330-job-submission-fixes-cleanup-v1-3-7de8c09cef8c@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
|
|
For geometry jobs with a paired fragment job, at the moment, the
DRM scheduler's prepare_job() callback:
- checks for internal (driver) dependencies for the geometry job;
- calls into pvr_queue_get_paired_frag_job_dep() to check for external
  dependencies for the fragment job (the two jobs are submitted together
  but the common scheduler code doesn't know about it, so this needs to
  be done at this point in time);
- calls into the prepare_job() callback again, but for the fragment job,
  to check its internal dependencies as well, passing the fragment job's
  drm_sched_job and the geometry job's drm_sched_entity / pvr_queue.
The problem with the last step is that pvr_queue_prepare_job() doesn't
always take the mismatched fragment job and geometry queue into account,
in particular when checking whether there is space for the fragment
command to be submitted, so the code ends up checking for space in the
geometry (i.e. wrong) CCCB.
The rest of the nested prepare_job() callback happens to work fine at
the moment as the other internal dependencies are not relevant for a
paired fragment job.
Move the initialisation of a paired fragment job's done fence and CCCB
fence to pvr_queue_get_paired_frag_job_dep(), inferring the correct
queue from the fragment job itself.
This fixes cases where prepare_job() wrongly assumed that there was
enough space for a paired fragment job in its own CCCB, unblocking
run_job(), which then returned early without writing the full sequence
of commands to the CCCB.
The above led to kernel warnings such as the following and potentially
job timeouts (depending on waiters on the missing commands):
[ 552.421075] WARNING: drivers/gpu/drm/imagination/pvr_cccb.c:178 at pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr], CPU#2: kworker/u16:5/63
[ 552.421230] Modules linked in:
[ 552.421592] CPU: 2 UID: 0 PID: 63 Comm: kworker/u16:5 Tainted: G W 7.0.0-rc2-gc5d053e4dccb #39 PREEMPT
[ 552.421625] Tainted: [W]=WARN
[ 552.421637] Hardware name: Texas Instruments AM625 SK (DT)
[ 552.421655] Workqueue: powervr-sched drm_sched_run_job_work [gpu_sched]
[ 552.421744] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 552.421766] pc : pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr]
[ 552.421850] lr : pvr_queue_submit_job_to_cccb+0x57c/0xa74 [powervr]
[ 552.421923] sp : ffff800084c47650
[ 552.421936] x29: ffff800084c47740 x28: 0000000000000df8 x27: ffff800088a77000
[ 552.421979] x26: 0000000000000030 x25: ffff800084c47680 x24: 0000000000001000
[ 552.422017] x23: ffff800084c47820 x22: 1ffff00010988ecc x21: 0000000000000008
[ 552.422055] x20: 0000000000000208 x19: ffff000006ad5a88 x18: 0000000000000000
[ 552.422093] x17: 0000000020020000 x16: 0000000000020000 x15: 0000000000000000
[ 552.422130] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 552.422167] x11: 000000000000f2f2 x10: 00000000f3000000 x9 : 00000000f3f3f3f3
[ 552.422204] x8 : 00000000f2f2f200 x7 : ffff700010988ecc x6 : 0000000000000008
[ 552.422241] x5 : 0000000000000000 x4 : 1ffff0001114ee00 x3 : 0000000000000000
[ 552.422278] x2 : 0000000000000007 x1 : 0000000000000fff x0 : 000000000000002f
[ 552.422316] Call trace:
[ 552.422330] pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr] (P)
[ 552.422411] pvr_queue_submit_job_to_cccb+0x57c/0xa74 [powervr]
[ 552.422486] pvr_queue_run_job+0x3a4/0x990 [powervr]
[ 552.422562] drm_sched_run_job_work+0x580/0xd48 [gpu_sched]
[ 552.422623] process_one_work+0x520/0x1288
[ 552.422657] worker_thread+0x3f0/0xb3c
[ 552.422679] kthread+0x334/0x3d8
[ 552.422706] ret_from_fork+0x10/0x20
Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling")
Cc: stable@vger.kernel.org
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Brajesh Gupta <brajesh.gupta@imgtec.com>
Link: https://patch.msgid.link/20260330-job-submission-fixes-cleanup-v1-2-7de8c09cef8c@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
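
The queue mismatch described in this entry can be illustrated with a minimal sketch. The types and helpers below (sketch_cccb, sketch_queue, sketch_job, sketch_cccb_fits, sketch_paired_frag_fits) are hypothetical and much simpler than the driver's; the sketch assumes each job holds a pointer to the queue it will be written to and that "used" never exceeds "size".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring buffer standing in for a CCCB. */
struct sketch_cccb {
	size_t size;	/* total bytes */
	size_t used;	/* bytes currently occupied, assumed <= size */
};

struct sketch_queue {
	struct sketch_cccb cccb;
};

struct sketch_job {
	struct sketch_queue *queue;	/* queue this job is written to */
	size_t cmd_size;		/* bytes the job's commands need */
};

static bool sketch_cccb_fits(const struct sketch_cccb *cccb, size_t bytes)
{
	return bytes <= cccb->size - cccb->used;
}

/* Infer the queue from the fragment job itself, not from the caller. */
static bool sketch_paired_frag_fits(const struct sketch_job *frag_job)
{
	return sketch_cccb_fits(&frag_job->queue->cccb, frag_job->cmd_size);
}
```

With a nearly full fragment buffer and an empty geometry buffer, checking the geometry queue would report enough space, while the check against the fragment job's own queue correctly reports that the command does not fit.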
|
|
The DRM scheduler's prepare_job() callback counts the remaining
non-signaled native dependencies for a job, preventing job submission
until those (plus job data and fence update) can fit in the job queue's
CCCB.
This means checking which dependencies can be waited upon in the
firmware, i.e. whether they are backed by a UFO object, which in turn
means their drm_sched_fence::parent has been assigned to a
pvr_queue_fence::base fence. That happens when the job owning the fence
is submitted to the firmware.
Paired geometry and fragment jobs are submitted at the same time, which
means the dependency between them can't be checked this way before
submission.
Update job_count_remaining_native_deps() to take into account the
dependency between paired jobs.
This fixes cases where prepare_job() underestimated the space left in
an almost full fragment CCCB, wrongly unblocking run_job(), which then
returned early without writing the full sequence of commands to the
CCCB.
The above led to kernel warnings such as the following and potentially
job timeouts (depending on waiters on the missing commands):
[ 375.702979] WARNING: drivers/gpu/drm/imagination/pvr_cccb.c:178 at pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr], CPU#1: kworker/u16:3/47
[ 375.703160] Modules linked in:
[ 375.703571] CPU: 1 UID: 0 PID: 47 Comm: kworker/u16:3 Tainted: G W 7.0.0-rc2-g817eb6b11ad5 #40 PREEMPT
[ 375.703613] Tainted: [W]=WARN
[ 375.703627] Hardware name: Texas Instruments AM625 SK (DT)
[ 375.703645] Workqueue: powervr-sched drm_sched_run_job_work [gpu_sched]
[ 375.703741] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 375.703764] pc : pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr]
[ 375.703847] lr : pvr_queue_submit_job_to_cccb+0x578/0xa70 [powervr]
[ 375.703921] sp : ffff800084a97650
[ 375.703934] x29: ffff800084a97740 x28: 0000000000000958 x27: ffff80008565d000
[ 375.703979] x26: 0000000000000030 x25: ffff800084a97680 x24: 0000000000001000
[ 375.704017] x23: ffff800084a97820 x22: 1ffff00010952ecc x21: 0000000000000008
[ 375.704056] x20: 00000000000006a8 x19: ffff00002ff7da88 x18: 0000000000000000
[ 375.704093] x17: 0000000020020000 x16: 0000000000020000 x15: 0000000000000000
[ 375.704132] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 375.704168] x11: 000000000000f2f2 x10: 00000000f3000000 x9 : 00000000f3f3f3f3
[ 375.704206] x8 : 00000000f2f2f200 x7 : ffff700010952ecc x6 : 0000000000000008
[ 375.704243] x5 : 0000000000000000 x4 : 1ffff00010acba00 x3 : 0000000000000000
[ 375.704279] x2 : 0000000000000007 x1 : 0000000000000fff x0 : 000000000000002f
[ 375.704317] Call trace:
[ 375.704331] pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr] (P)
[ 375.704411] pvr_queue_submit_job_to_cccb+0x578/0xa70 [powervr]
[ 375.704487] pvr_queue_run_job+0x3a4/0x990 [powervr]
[ 375.704562] drm_sched_run_job_work+0x580/0xd48 [gpu_sched]
[ 375.704623] process_one_work+0x520/0x1288
[ 375.704658] worker_thread+0x3f0/0xb3c
[ 375.704680] kthread+0x334/0x3d8
[ 375.704706] ret_from_fork+0x10/0x20
[ 375.704736] ---[ end trace 0000000000000000 ]---
Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling")
Cc: stable@vger.kernel.org
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Brajesh Gupta <brajesh.gupta@imgtec.com>
Link: https://patch.msgid.link/20260330-job-submission-fixes-cleanup-v1-1-7de8c09cef8c@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
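
The dependency counting described in this entry can be sketched as follows. This is an illustration under stated assumptions, not the driver's job_count_remaining_native_deps(): the structures are hypothetical, and the sketch assumes that a paired fragment job always needs one extra wait slot for its geometry job because the two are submitted together.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dependency record. */
struct sketch_dep {
	bool native;	/* fence created by this driver */
	bool signaled;
};

struct sketch_job {
	const struct sketch_dep *deps;
	size_t dep_count;
	const struct sketch_job *paired_geom;	/* NULL unless paired */
};

static size_t sketch_count_remaining_native_deps(const struct sketch_job *job)
{
	size_t remaining = 0;
	size_t i;

	for (i = 0; i < job->dep_count; i++) {
		if (job->deps[i].native && !job->deps[i].signaled)
			remaining++;
	}

	/*
	 * Assumption: the paired geometry job's fence cannot be resolved
	 * as a native fence before submission, so count it explicitly.
	 */
	if (job->paired_geom)
		remaining++;

	return remaining;
}
```

Without the extra count, the space estimate for the fragment command sequence would be one wait short, which matches the underestimate the commit message describes.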
|
|
This was done entirely with mindless brute force, using
    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
    Single allocations:      kmalloc(sizeof(TYPE), ...)
    are replaced with:       kmalloc_obj(TYPE, ...)
    Array allocations:       kmalloc_array(COUNT, sizeof(TYPE), ...)
    are replaced with:       kmalloc_objs(TYPE, COUNT, ...)
    Flex array allocations:  kmalloc(struct_size(PTR, FAM, COUNT), ...)
    are replaced with:       kmalloc_flex(*PTR, FAM, COUNT, ...)
    (where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
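
The idea of typed allocation helpers can be illustrated in userspace C. The sketch_alloc_* macros below are hypothetical analogues built on malloc() and calloc(); they are not the kernel's kmalloc_obj() family, take only a type name (not *VAR), and the flex variant omits the overflow protection that struct_size() provides.

```c
#include <stdlib.h>

/* One object: the result is already "TYPE *", not "void *". */
#define sketch_alloc_obj(TYPE) \
	((TYPE *)malloc(sizeof(TYPE)))

/* Array of objects; calloc() checks COUNT * size and zeroes the memory. */
#define sketch_alloc_objs(TYPE, COUNT) \
	((TYPE *)calloc((COUNT), sizeof(TYPE)))

/* Struct with a flexible array member FAM holding COUNT elements. */
#define sketch_alloc_flex(TYPE, FAM, COUNT) \
	((TYPE *)malloc(sizeof(TYPE) + (COUNT) * sizeof(((TYPE *)0)->FAM[0])))

struct sketch_point {
	int x;
	int y;
};

struct sketch_list {
	size_t len;
	int items[];	/* flexible array member */
};
```

Because each macro casts to the requested type, assigning the result to a pointer of a different type produces a compiler diagnostic instead of silently compiling, which is the main benefit the commit message points at.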
|
|
Among the scheduler's statuses, the only one that indicates an error is
DRM_GPU_SCHED_STAT_ENODEV. Any status other than DRM_GPU_SCHED_STAT_ENODEV
signifies that the operation succeeded and the GPU is in a nominal state.
However, describing the GPU's status more precisely requires conveying
more than just "OK".
Therefore, rename DRM_GPU_SCHED_STAT_NOMINAL to
DRM_GPU_SCHED_STAT_RESET, which better communicates the meaning of this
status. The status DRM_GPU_SCHED_STAT_RESET indicates that the GPU has
hung, but it has been successfully reset and is now in a nominal state
again.
Reviewed-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-1-5c5ba4f55039@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
|
|
This will be used in a later commit to trace the drm client_id in
some of the gpu_scheduler trace events.
This requires changing all the users of drm_sched_job_init to
add an extra parameter.
The newly added drm_client_id field in the drm_sched_fence is a bit
of a duplicate of the owner one. One suggestion I received was to
merge those 2 fields - this can't be done right now as amdgpu uses
some special values (AMDGPU_FENCE_OWNER_*) that can't really be
translated into a client id. Christian is working on getting rid of
those; when it's done we should be able to squash owner/drm_client_id
together.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250526125505.2360-3-pierre-eric.pelloux-prayer@amd.com
|
|
Backmerging to get updates from v6.15-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
|
For paired jobs, have the fragment job take a reference on the
geometry job, so that the geometry job cannot be freed until
the fragment job has finished with it.
The geometry job structure is accessed when the fragment job is being
prepared by the GPU scheduler. Taking the reference prevents the
geometry job being freed until the fragment job no longer requires it.
Fixes a use-after-free bug detected by KASAN:
[ 124.256386] BUG: KASAN: slab-use-after-free in pvr_queue_prepare_job+0x108/0x868 [powervr]
[ 124.264893] Read of size 1 at addr ffff0000084cb960 by task kworker/u16:4/63
Cc: stable@vger.kernel.org
Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling")
Signed-off-by: Brendan King <brendan.king@imgtec.com>
Reviewed-by: Matt Coster <matt.coster@imgtec.com>
Link: https://lore.kernel.org/r/20250318-ddkopsrc-1337-use-after-free-in-pvr_queue_prepare_job-v1-1-80fb30d044a6@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
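
The lifetime rule described in this entry can be sketched with plain reference counting. The structure and helpers below (sketch_job, sketch_job_get, sketch_job_put, sketch_job_pair) are hypothetical and use a bare counter rather than the kernel's kref; the sketch only assumes that the fragment job keeps a pointer to its paired geometry job.

```c
#include <stdlib.h>

struct sketch_job {
	unsigned int refcount;
	struct sketch_job *paired_geom;	/* reference held by a fragment job */
};

static unsigned int sketch_jobs_freed;	/* instrumentation for the sketch */

static struct sketch_job *sketch_job_create(void)
{
	struct sketch_job *job = calloc(1, sizeof(*job));

	if (job)
		job->refcount = 1;
	return job;
}

static struct sketch_job *sketch_job_get(struct sketch_job *job)
{
	job->refcount++;
	return job;
}

static void sketch_job_put(struct sketch_job *job)
{
	if (!job || --job->refcount)
		return;

	/* Drop the reference on the geometry job only when we are freed. */
	sketch_job_put(job->paired_geom);
	free(job);
	sketch_jobs_freed++;
}

/* The fragment job pins the geometry job for as long as it exists. */
static void sketch_job_pair(struct sketch_job *frag, struct sketch_job *geom)
{
	frag->paired_geom = sketch_job_get(geom);
}
```

After pairing, dropping the creator's reference on the geometry job does not free it; both jobs are released only when the fragment job's last reference goes away.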
|
|
This is a backmerge from Linux 6.14-rc6, needed for the nova PR.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Ensure job done fences are only initialised once.
This fixes a "memory manager not clean" warning from drm_mm_takedown()
on module unload.
Cc: stable@vger.kernel.org
Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling")
Signed-off-by: Brendan King <brendan.king@imgtec.com>
Reviewed-by: Matt Coster <matt.coster@imgtec.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250226-init-done-fences-once-v2-1-c1b2f556b329@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
|
|
Do scheduler queue fence release processing on a workqueue, rather
than in the release function itself.
Fixes deadlock issues such as the following:
[ 607.400437] ============================================
[ 607.405755] WARNING: possible recursive locking detected
[ 607.415500] --------------------------------------------
[ 607.420817] weston:zfq0/24149 is trying to acquire lock:
[ 607.426131] ffff000017d041a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: pvr_gem_object_vunmap+0x40/0xc0 [powervr]
[ 607.436728]
but task is already holding lock:
[ 607.442554] ffff000017d105a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_ioctl+0x250/0x554
[ 607.451727]
other info that might help us debug this:
[ 607.458245] Possible unsafe locking scenario:
[ 607.464155] CPU0
[ 607.466601] ----
[ 607.469044] lock(reservation_ww_class_mutex);
[ 607.473584] lock(reservation_ww_class_mutex);
[ 607.478114]
*** DEADLOCK ***
Cc: stable@vger.kernel.org
Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling")
Signed-off-by: Brendan King <brendan.king@imgtec.com>
Reviewed-by: Matt Coster <matt.coster@imgtec.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250226-fence-release-deadlock-v2-1-6fed2fc1fe88@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
|
|
drm_sched_init() has a great many parameters and upcoming new
functionality for the scheduler might add even more. Generally, the
great number of parameters reduces readability and has already caused
one misnaming, addressed in:
commit 6f1cacf4eba7 ("drm/nouveau: Improve variable name in
nouveau_sched_init()").
Introduce a new struct for the scheduler init parameters and port all
users.
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Matthew Brost <matthew.brost@intel.com> # for Xe
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # for Panfrost and Panthor
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> # for Etnaviv
Reviewed-by: Frank Binns <frank.binns@imgtec.com> # for Imagination
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> # for Sched
Reviewed-by: Maíra Canal <mcanal@igalia.com> # for v3d
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> # for amdxdna
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250211111422.21235-2-phasta@kernel.org
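
The readability argument for an init-parameter struct can be illustrated with a minimal sketch. The names below (sketch_sched_init_args, sketch_sched, sketch_sched_init) and the chosen fields are hypothetical and do not reproduce the scheduler's real structure; the sketch only shows how designated initializers name every argument at the call site.

```c
#include <stddef.h>

/* Hypothetical parameter block; the field set is illustrative only. */
struct sketch_sched_init_args {
	const char *name;
	unsigned int num_rqs;
	unsigned int credit_limit;
	long timeout;
};

struct sketch_sched {
	const char *name;
	unsigned int num_rqs;
	unsigned int credit_limit;
	long timeout;
};

/* Returns 0 on success, -1 on invalid arguments. */
static int sketch_sched_init(struct sketch_sched *sched,
			     const struct sketch_sched_init_args *args)
{
	if (!sched || !args || !args->name || args->num_rqs == 0)
		return -1;

	sched->name = args->name;
	sched->num_rqs = args->num_rqs;
	sched->credit_limit = args->credit_limit;
	sched->timeout = args->timeout;
	return 0;
}
```

A caller would write something like `.name = "sketch", .num_rqs = 1, .credit_limit = 64, .timeout = 500` inside the initializer, so two same-typed arguments cannot be swapped silently, and new fields can be added without touching callers that do not use them.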
|
|
The current implementation of drm_sched_start uses a hardcoded
-ECANCELED to dispose of a job when the parent/hw fence is NULL.
This results in drm_sched_job_done being called with -ECANCELED for
each job with a NULL parent in the pending list, making it difficult
to distinguish between recovery methods, whether a queue reset or a
full GPU reset was used.
To improve this, we first try a soft recovery for timeout jobs and
use the error code -ENODATA. If soft recovery fails, we proceed with
a queue reset, where the error code remains -ENODATA for the job.
Finally, for a full GPU reset, we use error codes -ECANCELED or
-ETIME. This patch adds an error code parameter to drm_sched_start,
allowing us to differentiate between queue reset and GPU reset
failures. This enables user mode and test applications to validate
the expected correctness of the requested operation. After a
successful queue reset, the only way to continue normal operation is
to call drm_sched_job_done with the specific error code -ENODATA.
v1: Initial implementation by Jesse utilized amdgpu_device_lock_reset_domain
and amdgpu_device_unlock_reset_domain to allow user mode to track
the queue reset status and distinguish between queue reset and
GPU reset.
v2: Christian suggested using the error codes -ENODATA for queue reset
and -ECANCELED or -ETIME for GPU reset, returned to
amdgpu_cs_wait_ioctl.
v3: To meet the requirements, we introduce a new function
drm_sched_start_ex with an additional parameter to set
dma_fence_set_error, allowing us to handle the specific error
codes appropriately and dispose of bad jobs with the selected
error code depending on whether it was a queue reset or GPU reset.
v4: Alex suggested using a new name, drm_sched_start_with_recovery_error,
which more accurately describes the function's purpose.
Additionally, it was recommended to add documentation details
about the new method.
v5: Fixed declaration of new function drm_sched_start_with_recovery_error. (Alex)
v6 (chk): rebase on upstream changes, cleanup the commit message,
drop the new function again and update all callers,
apply the errno also to scheduler fences with hw fences
v7 (chk): rebased
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240826122541.85663-1-christian.koenig@amd.com
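
The effect of passing an error code into the restart path can be sketched as follows. This is a simplified illustration, not the scheduler's implementation: sketch_job and sketch_sched_start are hypothetical, and the sketch assumes that only pending jobs without a hardware fence are disposed of with the caller's error code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pending job. */
struct sketch_job {
	bool has_hw_fence;	/* false models a NULL parent fence */
	int error;		/* error recorded when the job is disposed */
};

/*
 * Dispose of pending jobs that never reached the hardware, tagging them
 * with the caller-supplied error instead of a hardcoded one.
 * Returns the number of jobs disposed.
 */
static size_t sketch_sched_start(struct sketch_job *pending, size_t count,
				 int err)
{
	size_t disposed = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (pending[i].has_hw_fence)
			continue;

		pending[i].error = err;
		disposed++;
	}

	return disposed;
}
```

A queue-reset path could pass -ENODATA and a full GPU reset path -ECANCELED, so a waiter inspecting the job's error can tell the two recovery methods apart.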
|
|
This was basically just another one of amdgpu's hacks. The parameter
allowed restarting the scheduler without turning fence signaling on
again.
That this is absolutely not a good idea should be obvious by now since
the fences will then just sit there and never signal.
While at it, clean up the code a bit.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240722083816.99685-1-christian.koenig@amd.com
|
|
Fix compilation issues with DRM scheduler priority rename MIN to LOW.
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311252109.WgbJsSkG-lkp@intel.com/
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Frank Binns <frank.binns@imgtec.com>
Cc: Donald Robson <donald.robson@imgtec.com>
Cc: Matt Coster <matt.coster@imgtec.com>
Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org>
Fixes: fe375c74806dbd ("drm/sched: Rename priority MIN to LOW")
Fixes: 38f922a563aac3 ("drm/sched: Reverse run-queue priority enumeration")
Fixes: 5f03a507b29e44 ("drm/nouveau: implement 1:1 scheduler - entity relationship")
Link: https://patchwork.freedesktop.org/patch/msgid/20231125192246.87268-2-ltuikov89@gmail.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/7429262c-6dea-4dcc-bf7e-54d2277dabf1@amd.com
|
|
Implement job submission ioctl. Job scheduling is implemented using
drm_sched.
Jobs are submitted in a stream format. This is intended to allow the UAPI
data format to be independent of the actual FWIF structures in use, which
vary depending on the GPU in use.
The stream formats are documented at:
https://gitlab.freedesktop.org/mesa/mesa/-/blob/f8d2b42ae65c2f16f36a43e0ae39d288431e4263/src/imagination/csbgen/rogue_kmd_stream.xml
Changes since v8:
- Updated for upstreamed DRM scheduler changes
- Removed workaround code for the pending_list previously being updated
after run_job() returned
- Fixed null deref in pvr_queue_cleanup_fw_context() for bad stream ptr
given to create_context ioctl
- Corrected license identifiers
Changes since v7:
- Updated for v8 "DRM scheduler changes for XE" patchset
Changes since v6:
- Fix fence handling in pvr_sync_signal_array_add()
- Add handling for SUBMIT_JOB_FRAG_CMD_DISABLE_PIXELMERGE flag
- Fix missing dma_resv locking in job submit path
Changes since v5:
- Fix leak in job creation error path
Changes since v4:
- Use a regular workqueue for job scheduling
Changes since v3:
- Support partial render jobs
- Add job timeout handler
- Split sync handling out of job code
- Use drm_dev_{enter,exit}
Changes since v2:
- Use drm_sched for job scheduling
Co-developed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Co-developed-by: Donald Robson <donald.robson@imgtec.com>
Signed-off-by: Donald Robson <donald.robson@imgtec.com>
Signed-off-by: Sarah Walker <sarah.walker@imgtec.com>
Link: https://lore.kernel.org/r/c98dab7a5f5fb891fbed7e4990d19b5d13964365.1700668843.git.donald.robson@imgtec.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|