summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/ttm/ttm_device.c
AgeCommit message (Collapse)Author
2023-10-16drm/ttm: Reorder sys manager cleanup stepKarolina Stolarek
With the current cleanup flow, we could trigger a NULL pointer dereference if there is a delayed destruction of a BO with a system resource that gets executed on drain_workqueue() call, as we attempt to free a resource using an already released resource manager. Remove the device from the device list and drain its workqueue before releasing the system domain manager in ttm_device_fini(). Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231016121525.2237838-1-karolina.stolarek@intel.com Signed-off-by: Christian König <christian.koenig@amd.com>
2023-06-15Merge tag 'amd-drm-next-6.5-2023-06-09' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.5-2023-06-02: amdgpu: - SR-IOV fixes - Warning fixes - Misc code cleanups and spelling fixes - DCN 3.2 updates - Improved DC FAMS support for better power management - Improved DC SubVP support for better power management - DCN 3.1.x fixes - Max IB size query - DC GPU reset fixes - RAS updates - DCN 3.0.x fixes - S/G display fixes - CP shadow buffer support - Implement connector force callback - Z8 power improvements - PSP 13.0.10 vbflash support - Mode2 reset fixes - Store MQDs in VRAM to improve queue switch latency - VCN 3.x fixes - JPEG 3.x fixes - Enable DC_FP on LoongArch - GFXOFF fixes - GC 9.4.3 partition support - SDMA 4.4.2 partition support - VCN/JPEG 4.0.3 partition support - VCN 4.0.3 updates - NBIO 7.9 updates - GC 9.4.3 updates - Take NUMA into account when allocating memory - Handle NUMA for partitions - SMU 13.0.6 updates - GC 9.4.3 RAS updates - Stop including unused swiotlb.h - SMU 13.0.7 fixes - Fix clock output ordering on some APUs - Clean up DC FPGA code - GFX9 preemption fixes - Misc irq fixes - S0ix fixes - Add new DRM_AMDGPU_WERROR config parameter to help with CI - PCIe fix for RDNA2 - kdoc fixes - Documentation updates amdkfd: - Query TTM mem limit rather than hardcoding it - GC 9.4.3 partition support - Handle NUMA for partitions radeon: - Fix possible double free - Stop including unused swiotlb.h - Fix possible division by zero ttm: - Add query for TTM mem limit - Add NUMA awareness to pools - Export ttm_pool_fini() UAPI: - Add new ctx query flag to better handle GPU resets Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22290 - Add new interface to query and set shadow buffer for RDNA3 Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21986 - Add new INFO query for max IB size Proposed userspace: https://gitlab.freedesktop.org/bnieuwenhuizen/mesa/-/commits/ib-rejection-v3 amd-drm-next-6.5-2023-06-09: amdgpu: - S0ix fixes - Initial SMU13 Overdrive support - kdoc fixes - Misc clode cleanups - Flexible array fixes - Display OTG fixes - SMU 13.0.6 updates - Revert some broken clock counter updates - Misc display fixes - GFX9 preemption fixes - Add support for newer EEPROM bad page table format - Add missing radeon secondary id - Add support for new colorspace KMS API - CSA fix - Stable pstate fixes for APUs - make vbl interface admin only - Handle PCI accelerator class amdkfd: - Add debugger support for gdb radeon: - Fix possible UAF drm: - Add Colorspace functionality UAPI: - Add debugger interface for enabling gdb Proposed userspace: https://github.com/ROCm-Developer-Tools/ROCdbgapi/tree/wip-dbgapi - Add KMS colorspace API Discussion: https://lists.freedesktop.org/archives/dri-devel/2023-June/408128.html From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609174817.7764-1-alexander.deucher@amd.com
2023-06-09drm/ttm: add NUMA node id to the poolRajneesh Bhardwaj
This allows backing ttm_tt structure with pages from different NUMA pools. Tested-by: Graham Sider <graham.sider@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-17drm/ttm: let struct ttm_device_funcs be placed in rodataJani Nikula
Make the struct ttm_device_funcs pointers const so the data can be placed in rodata. Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230309123700.528641-1-jani.nikula@intel.com
2023-03-29Merge v6.3-rc4 into drm-nextDaniel Vetter
I just landed the fence deadline PR from Rob that a bunch of drivers want/need to apply driver-specific patches. Backmerge -rc4 so that they don't have to be stuck on -rc2 for no reason at all. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2023-03-13Merge drm/drm-fixes into drm-misc-fixesThomas Zimmermann
Backmerging to get latest upstream. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2023-03-09drm/ttm: Unexport ttm_global_swapout()Thomas Hellström
Unexport ttm_global_swapout() since it is not used outside of TTM. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230307144621.10748-5-thomas.hellstrom@linux.intel.com
2023-03-07drm/ttm: Fix a NULL pointer dereferenceThomas Hellström
The LRU mechanism may look up a resource in the process of being removed from an object. The locking rules here are a bit unclear but it looks currently like res->bo assignment is protected by the LRU lock, whereas bo->resource is protected by the object lock, while *clearing* of bo->resource is also protected by the LRU lock. This means that if we check that bo->resource points to the LRU resource under the LRU lock we should be safe. So perform that check before deciding to swap out a bo. That avoids dereferencing a NULL bo->resource in ttm_bo_swapout(). Fixes: 6a9b02899402 ("drm/ttm: move the LRU into resource handling v4") Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Philip Yang <Philip.Yang@amd.com> Cc: Qiang Yu <qiang.yu@amd.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.19+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230307144621.10748-2-thomas.hellstrom@linux.intel.com
2022-12-06drm/ttm: merge ttm_bo_api.h and ttm_bo_driver.h v2Christian König
Merge and cleanup the two headers into a single description of the object API. Also move all the documentation to the implementation and drop unnecessary includes from the header. No functional change. v2: minimal checkpatch.pl cleanup Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221125102137.1801-4-christian.koenig@amd.com
2022-12-06drm/ttm: use per BO cleanup workersChristian König
Instead of a single worker going over the list of delete BOs in regular intervals use a per BO worker which blocks for the resv object and locking of the BO. This not only simplifies the handling massively, but also results in much better response time when cleaning up buffers. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221125102137.1801-3-christian.koenig@amd.com
2022-06-10drm/ttm: fix missing NULL check in ttm_device_swapoutChristian König
Resources about to be destructed are not tied to BOs any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Fixes: 6a9b02899402 ("drm/ttm: move the LRU into resource handling v4") Link: https://patchwork.freedesktop.org/patch/msgid/20220603104604.456991-1-christian.koenig@amd.com
2022-03-28drm/ttm: add resource iterator v4Christian König
Instead of duplicating that at different places add an iterator over all the resources in a resource manager. v2: add lockdep annotation and kerneldoc v3: fix various bugs pointed out by Felix v4: simplify the code a bit more Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v2) Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220321132601.2161-2-christian.koenig@amd.com
2022-03-28drm/ttm: move the LRU into resource handling v4Christian König
This way we finally fix the problem that new resource are not immediately evict-able after allocation. That has caused numerous problems including OOM on GDS handling and not being able to use TTM as general resource manager. v2: stop assuming in ttm_resource_fini that res->bo is still valid. v3: cleanup kerneldoc, add more lockdep annotation v4: consistently use res->num_pages Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220321132601.2161-1-christian.koenig@amd.com
2021-09-14Merge drm/drm-next into drm-misc-nextMaxime Ripard
Kickstart new drm-misc-next cycle. Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-09-01drm/ttm: Clear all DMA mappings on demandAndrey Grodzovsky
Used by drivers supporting hot unplug to handle all DMA IOMMU group related dependencies before the group is removed during device removal and we try to access it after free when last device pointer from user space is dropped. v3: Switch to ttm_bo_get_unless_zerom Iterate bdev for pinned list Switch to ttm_tt_unpopulate Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210827203910.5565-3-andrey.grodzovsky@amd.com
2021-08-30drm/ttm: Create pinned listAndrey Grodzovsky
This list will be used to capture all non VRAM BOs not on LRU so when device is hot unplugged we can iterate the list and unmap DMA mappings before device is removed. v2: Reanme function to ttm_bo_move_to_pinned v3: Move the pinned list to ttm device Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/451614/?series=93971
2021-08-16drm: ttm: Don't bail from ttm_global_init if debugfs_create_dir failsDan Moulding
In 69de4421bb4c ("drm/ttm: Initialize debugfs from ttm_global_init()"), ttm_global_init was changed so that if creation of the debugfs global root directory fails, ttm_global_init will bail out early and return an error, leading to initialization failure of DRM drivers. However, not every system will be using debugfs. On such a system, debugfs directory creation can be expected to fail, but DRM drivers must still be usable. This changes it so that if creation of TTM's debugfs root directory fails, then no biggie: keep calm and carry on. Fixes: 69de4421bb4c ("drm/ttm: Initialize debugfs from ttm_global_init()") Signed-off-by: Dan Moulding <dmoulding@me.com> Tested-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210810195906.22220-2-dmoulding@me.com Signed-off-by: Christian König <christian.koenig@amd.com>
2021-07-22drm/ttm: Initialize debugfs from ttm_global_init()Jason Ekstrand
We create a bunch of debugfs entries as a side-effect of ttm_global_init() and then never clean them up. This isn't usually a problem because we free the whole debugfs directory on module unload. However, if the global reference count ever goes to zero and then ttm_global_init() is called again, we'll re-create those debugfs entries and debugfs will complain in dmesg that we're creating entries that already exist. This patch fixes this problem by changing the lifetime of the whole TTM debugfs directory to match that of the TTM global state. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210721152358.2893314-6-jason@jlekstrand.net
2021-07-21drm/ttm: Force re-init if ttm_global_init() failsJason Ekstrand
If we have a failure, decrement the reference count so that the next call to ttm_global_init() will actually do something instead of assume everything is all set up. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: 62b53b37e4b1 ("drm/ttm: use a static ttm_bo_global instance") Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210720181357.2760720-5-jason@jlekstrand.net Signed-off-by: Christian König <christian.koenig@amd.com>
2021-06-23Backmerge tag 'v5.13-rc7' into drm-nextDave Airlie
Backmerge Linux 5.13-rc7 to make some pulls from later bases apply, and to bake in the conflicts so far.
2021-06-08drm/ttm: fix deref of bo->ttm without holding the lock v2Christian König
We need to grab the resv lock first before doing that check. v2 (chk): simplify the change for -fixes Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210528130041.1683-1-christian.koenig@amd.com
2021-05-26drm/ttm: Skip swapout if ttm object is not populatedxinhui pan
Swapping a ttm object which has no backend pages makes no sense. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210521083112.33176-1-xinhui.pan@amd.com CC: stable@kernel.org Signed-off-by: Christian König <christian.koenig@amd.com>
2021-05-03drm/ttm: add ttm_sys_manager v3Christian König
Add a separate manager for the system domain and make function tables mandatory. v2: debug is still optional v3: return void during init Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210430092508.60710-1-christian.koenig@amd.com
2021-04-23drm/ttm: fix error handling if no BO can be swapped out v4Shiwu Zhang
In case that all pre-allocated BOs are busy, just continue to populate BOs since likely half of system memory in total is still free. v4 (chk): fix code moved to VMWGFX as well Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210422115757.3946-1-christian.koenig@amd.com
2021-04-23drm/ttm: fix error handling if no BO can be swapped out v4Shiwu Zhang
In case that all pre-allocated BOs are busy, just continue to populate BOs since likely half of system memory in total is still free. v4 (chk): fix code moved to VMWGFX as well Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210422115757.3946-1-christian.koenig@amd.com
2021-04-22drm/ttm/ttm_device: Demote kernel-doc abusesLee Jones
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/ttm/ttm_device.c:42: warning: Function parameter or member 'ttm_global_mutex' not described in 'DEFINE_MUTEX' drivers/gpu/drm/ttm/ttm_device.c:42: warning: expecting prototype for ttm_global_mutex(). Prototype was for DEFINE_MUTEX() instead drivers/gpu/drm/ttm/ttm_device.c:112: warning: Function parameter or member 'ctx' not described in 'ttm_global_swapout' drivers/gpu/drm/ttm/ttm_device.c:112: warning: Function parameter or member 'gfp_flags' not described in 'ttm_global_swapout' drivers/gpu/drm/ttm/ttm_device.c:112: warning: expecting prototype for A buffer object shrink method that tries to swap out the first(). Prototype was for ttm_global_swapout() instead Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210416143725.2769053-28-lee.jones@linaro.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2021-04-09drm/ttm: make global mutex and use count staticChristian König
Only needed during device hot plug and remove and not exported. Signed-off-by: Christian König <christian.koenig@amd.com> Suggested-by: Bernard <bernard@vivo.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210409110730.2958-1-christian.koenig@amd.com
2021-03-29drm/ttm: switch back to static allocation limits for nowChristian König
The shrinker based approach still has some flaws. Especially that we need temporary pages to free up the pages allocated to the driver is problematic in a shrinker. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210324134845.2338-1-christian.koenig@amd.com
2021-03-24drm/ttm: switch to per device LRU lockChristian König
Instead of having a global lock for potentially less contention. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/424010/
2021-03-24drm/ttm: remove swap LRU v3Christian König
Instead evict round robin from each devices SYSTEM and TT domain. v2: reorder num_pages access reported by Dan's script v3: fix rebase fallout, num_pages should be 32bit Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/424009/
2021-03-24drm/ttm: move swapout logic around v3Christian König
Move the iteration of the global lru into the new function ttm_global_swapout() and use that instead in drivers. v2: consistently return int v3: fix build fail Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/424008/
2021-02-09drm/ttm: move memory accounting into vmwgfx v4Christian König
This is just another feature which is only used by VMWGFX, so move it into the driver instead. I've tried to add the accounting sysfs file to the kobject of the drm minor, but I'm not 100% sure if this works as expected. v2: fix typo in KFD and avoid 64bit divide v3: fix init order in VMWGFX v4: use pdev sysfs reference instead of drm Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Zack Rusin <zackr@vmware.com> (v3) Tested-by: Nirmoy Das <nirmoy.das@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210208133226.36955-2-christian.koenig@amd.com
2021-02-09drm/ttm: fix removal of bo_count sysfs fileChristian König
Only a zombie leftover from rebasing. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: 3763d635deaa ("drm/ttm: add debugfs directory v2") Reported-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210209131756.24650-1-christian.koenig@amd.com
2021-01-21drm/ttm: device naming cleanupChristian König
Rename ttm_bo_device to ttm_device. Rename ttm_bo_driver to ttm_device_funcs. Rename ttm_bo_global to ttm_global. Move global and device related functions to ttm_device.[ch]. No functional change. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/415222/