summaryrefslogtreecommitdiff
path: root/Documentation/gpu
diff options
context:
space:
mode:
authorDave Airlie <airlied@redhat.com>2023-10-16 10:40:31 +1000
committerDave Airlie <airlied@redhat.com>2023-10-16 10:40:33 +1000
commitd32ce5ab7b52e372c40bf792b53853934000a33f (patch)
treee28bc6396f0ef5ab219dcb0bc3d5bff86c8f11c9 /Documentation/gpu
parent389af786f92ecdff35883551d54bf4e507ffcccb (diff)
parentc395c83aafbb9cdbe4230f044d5b8eaf9080c0c5 (diff)
downloadlwn-d32ce5ab7b52e372c40bf792b53853934000a33f.tar.gz
lwn-d32ce5ab7b52e372c40bf792b53853934000a33f.zip
Merge tag 'drm-misc-next-2023-10-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.7-rc1: Contains the previous pull request drm-misc-next-2023-10-06 + following: Cross-subsystem Changes: - Rename fb_pgprot to pgprot_framebuffer and remove file argument/ - Update iosys-map documentation typos. Core Changes: - Assorted fixes to drm/panel. - Add HPD state to drm_connector_oob_hotplug_event(), and implement oob hotplug events in bridge connector. - Replace drm_framebuffer_plane_width/height with calls to drm_format_info_plane_width/height. Driver Changes: - Clock and debug fixes for bridge/samsung-dsim. - More btree -> maple tree conversions. - Assorted bugfixes in rockchip, panel-tpo-tpg110, - Add LTK050H3148W-CTA6 panel support. - Assorted small fixes in host1x, tegra, simpledrm. - Suspend fixes for host1x. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/3812345e-b086-4d72-8504-f58d84e8feab@linux.intel.com
Diffstat (limited to 'Documentation/gpu')
-rw-r--r--Documentation/gpu/drivers.rst1
-rw-r--r--Documentation/gpu/drm-mm.rst20
-rw-r--r--Documentation/gpu/drm-uapi.rst77
-rw-r--r--Documentation/gpu/drm-usage-stats.rst1
-rw-r--r--Documentation/gpu/panfrost.rst40
5 files changed, 129 insertions, 10 deletions
diff --git a/Documentation/gpu/drivers.rst b/Documentation/gpu/drivers.rst
index 3a52f48215a3..45a12e552091 100644
--- a/Documentation/gpu/drivers.rst
+++ b/Documentation/gpu/drivers.rst
@@ -18,6 +18,7 @@ GPU Driver Documentation
xen-front
afbc
komeda-kms
+ panfrost
.. only:: subproject and html
diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index c19b34b1c0ed..602010cb6894 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -466,40 +466,40 @@ DRM MM Range Allocator Function References
.. kernel-doc:: drivers/gpu/drm/drm_mm.c
:export:
-DRM GPU VA Manager
-==================
+DRM GPUVM
+=========
Overview
--------
-.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+.. kernel-doc:: drivers/gpu/drm/drm_gpuvm.c
:doc: Overview
Split and Merge
---------------
-.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+.. kernel-doc:: drivers/gpu/drm/drm_gpuvm.c
:doc: Split and Merge
Locking
-------
-.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+.. kernel-doc:: drivers/gpu/drm/drm_gpuvm.c
:doc: Locking
Examples
--------
-.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+.. kernel-doc:: drivers/gpu/drm/drm_gpuvm.c
:doc: Examples
-DRM GPU VA Manager Function References
---------------------------------------
+DRM GPUVM Function References
+-----------------------------
-.. kernel-doc:: include/drm/drm_gpuva_mgr.h
+.. kernel-doc:: include/drm/drm_gpuvm.h
:internal:
-.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+.. kernel-doc:: drivers/gpu/drm/drm_gpuvm.c
:export:
DRM Buddy Allocator
diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index eef5fd19bc92..632989df3727 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -285,6 +285,83 @@ for GPU1 and GPU2 from different vendors, and a third handler for
mmapped regular files. Threads cause additional pain with signal
handling as well.
+Device reset
+============
+
+The GPU stack is really complex and is prone to errors, from hardware bugs,
+faulty applications and everything in between the many layers. Some errors
+require resetting the device in order to make the device usable again. This
+section describes the expectations for DRM and usermode drivers when a
+device resets and how to propagate the reset status.
+
+Device resets can not be disabled without tainting the kernel, which can lead to
+hanging the entire kernel through shrinkers/mmu_notifiers. Userspace role in
+device resets is to propagate the message to the application and apply any
+special policy for blocking guilty applications, if any. Corollary is that
+debugging a hung GPU context require hardware support to be able to preempt such
+a GPU context while it's stopped.
+
+Kernel Mode Driver
+------------------
+
+The KMD is responsible for checking if the device needs a reset, and to perform
+it as needed. Usually a hang is detected when a job gets stuck executing. KMD
+should keep track of resets, because userspace can query any time about the
+reset status for a specific context. This is needed to propagate to the rest of
+the stack that a reset has happened. Currently, this is implemented by each
+driver separately, with no common DRM interface. Ideally this should be properly
+integrated at DRM scheduler to provide a common ground for all drivers. After a
+reset, KMD should reject new command submissions for affected contexts.
+
+User Mode Driver
+----------------
+
+After command submission, UMD should check if the submission was accepted or
+rejected. After a reset, KMD should reject submissions, and UMD can issue an
+ioctl to the KMD to check the reset status, and this can be checked more often
+if the UMD requires it. After detecting a reset, UMD will then proceed to report
+it to the application using the appropriate API error code, as explained in the
+section below about robustness.
+
+Robustness
+----------
+
+The only way to try to keep a graphical API context working after a reset is if
+it complies with the robustness aspects of the graphical API that it is using.
+
+Graphical APIs provide ways to applications to deal with device resets. However,
+there is no guarantee that the app will use such features correctly, and a
+userspace that doesn't support robust interfaces (like a non-robust
+OpenGL context or API without any robustness support like libva) leave the
+robustness handling entirely to the userspace driver. There is no strong
+community consensus on what the userspace driver should do in that case,
+since all reasonable approaches have some clear downsides.
+
+OpenGL
+~~~~~~
+
+Apps using OpenGL should use the available robust interfaces, like the
+extension ``GL_ARB_robustness`` (or ``GL_EXT_robustness`` for OpenGL ES). This
+interface tells if a reset has happened, and if so, all the context state is
+considered lost and the app proceeds by creating new ones. There's no consensus
+on what to do to if robustness is not in use.
+
+Vulkan
+~~~~~~
+
+Apps using Vulkan should check for ``VK_ERROR_DEVICE_LOST`` for submissions.
+This error code means, among other things, that a device reset has happened and
+it needs to recreate the contexts to keep going.
+
+Reporting causes of resets
+--------------------------
+
+Apart from propagating the reset through the stack so apps can recover, it's
+really useful for driver developers to learn more about what caused the reset in
+the first place. DRM devices should make use of devcoredump to store relevant
+information about the reset, so this information can be added to user bug
+reports.
+
.. _drm_driver_ioctl:
IOCTL Support on Device Nodes
diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index 044e6b2ed1be..7aca5c7a7b1d 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -169,3 +169,4 @@ Driver specific implementations
-------------------------------
:ref:`i915-usage-stats`
+:ref:`panfrost-usage-stats`
diff --git a/Documentation/gpu/panfrost.rst b/Documentation/gpu/panfrost.rst
new file mode 100644
index 000000000000..b80e41f4b2c5
--- /dev/null
+++ b/Documentation/gpu/panfrost.rst
@@ -0,0 +1,40 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+=========================
+ drm/Panfrost Mali Driver
+=========================
+
+.. _panfrost-usage-stats:
+
+Panfrost DRM client usage stats implementation
+==============================================
+
+The drm/Panfrost driver implements the DRM client usage stats specification as
+documented in :ref:`drm-client-usage-stats`.
+
+Example of the output showing the implemented key value pairs and entirety of
+the currently possible format options:
+
+::
+ pos: 0
+ flags: 02400002
+ mnt_id: 27
+ ino: 531
+ drm-driver: panfrost
+ drm-client-id: 14
+ drm-engine-fragment: 1846584880 ns
+ drm-cycles-fragment: 1424359409
+ drm-maxfreq-fragment: 799999987 Hz
+ drm-curfreq-fragment: 799999987 Hz
+ drm-engine-vertex-tiler: 71932239 ns
+ drm-cycles-vertex-tiler: 52617357
+ drm-maxfreq-vertex-tiler: 799999987 Hz
+ drm-curfreq-vertex-tiler: 799999987 Hz
+ drm-total-memory: 290 MiB
+ drm-shared-memory: 0 MiB
+ drm-active-memory: 226 MiB
+ drm-resident-memory: 36496 KiB
+ drm-purgeable-memory: 128 KiB
+
+Possible `drm-engine-` key names are: `fragment`, and `vertex-tiler`.
+`drm-curfreq-` values convey the current operating frequency for that engine.