summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/ttm/ttm_pool.c
AgeCommit message (Collapse)Author
2024-07-23drm/ttm: Allow direct reclaim to allocate local memory v2Rajneesh Bhardwaj
Limiting the allocation of higher order pages to the closest NUMA node and enabling direct memory reclaim provides not only failsafe against situations when memory becomes too much fragmented and the allocator is not able to satisfy the request from the local node but falls back to remote pages (HUGEPAGE) but also offers performance improvement. Accessing remote pages suffers due to bandwidth limitations and could be avoided if memory becomes defragmented and in most cases without using manual compaction. (/proc/sys/vm/compact_memory) Note: On certain distros such as RHEL, the proactive compaction is disabled. (https://tinyurl.com/4f32f7rs) v2 (chk): drop __GFP_RECLAIM since that is already set by GFP_USER Cc: Dave Airlie <airlied@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708160636.1147308-1-rajneesh.bhardwaj@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
2024-04-15drm/ttm: stop pooling cached NUMA pages v2Christian König
We only pool write combined and uncached allocations because they require extra overhead on allocation and release. If we also pool cached NUMA it not only means some extra unnecessary overhead, but also that under memory pressure it can happen that pages from the wrong NUMA node enters the pool and are re-used over and over again. This can lead to performance reduction after running into memory pressure. v2: restructure and cleanup the code a bit from the internal hack to test this. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: 4482d3c94d7f ("drm/ttm: add NUMA node id to the pool") CC: stable@vger.kernel.org Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240415134821.1919-1-christian.koenig@amd.com
2024-02-22drm/ttm: Fix an invalid freeing on already freed page in error pathThomas Hellström
If caching mode change fails due to, for example, OOM we free the allocated pages in a two-step process. First the pages for which the caching change has already succeeded. Secondly the pages for which a caching change did not succeed. However the second step was incorrectly freeing the pages already freed in the first step. Fix. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error path") Cc: Christian König <christian.koenig@amd.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.4+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240221073324.3303-1-thomas.hellstrom@linux.intel.com
2024-01-08mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDERKirill A. Shutemov
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-08mm, treewide: introduce NR_PAGE_ORDERSKirill A. Shutemov
NR_PAGE_ORDERS defines the number of page orders supported by the page allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total. NR_PAGE_ORDERS assists in defining arrays of page orders and allows for more natural iteration over them. [kirill.shutemov@linux.intel.com: fixup for kerneldoc warning] Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box Link: https://lkml.kernel.org/r/20231228144704.14033-1-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04drm/ttm: dynamically allocate the drm-ttm_pool shrinkerQi Zheng
Use new APIs to dynamically allocate the drm-ttm_pool shrinker. Link: https://lkml.kernel.org/r/20230911094444.68966-5-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Carlos Llamas <cmllamas@google.com> Cc: Chandan Babu R <chandan.babu@oracle.com> Cc: Chao Yu <chao@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Chuck Lever <cel@kernel.org> Cc: Coly Li <colyli@suse.de> Cc: Dai Ngo <Dai.Ngo@oracle.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Sterba <dsterba@suse.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Gao Xiang <hsiangkao@linux.alibaba.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Wang <jasowang@redhat.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Marijn Suijten <marijn.suijten@somainline.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nadav Amit <namit@vmware.com> Cc: Neil Brown <neilb@suse.de> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Cc: Olga Kornievskaia <kolga@netapp.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Clark <robdclark@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sean Paul <sean@poorly.run> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Song Liu <song@kernel.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Steven Price <steven.price@arm.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Tom Talpey <tom@talpey.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yue Hu <huyue2@coolpad.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04drm/ttm: introduce pool_shrink_rwsemQi Zheng
Currently, synchronize_shrinkers() is only used by TTM pool. It only requires that no shrinkers run in parallel. After we use RCU+refcount method to implement the lockless slab shrink, we can not use shrinker_rwsem or synchronize_rcu() to guarantee that all shrinker invocations have seen an update before freeing memory. So we introduce a new pool_shrink_rwsem to implement a private ttm_pool_synchronize_shrinkers(), so as to achieve the same purpose. Link: https://lkml.kernel.org/r/20230911092517.64141-5-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Christian Brauner <brauner@kernel.org> Cc: Chuck Lever <cel@kernel.org> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Steven Price <steven.price@arm.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Carlos Llamas <cmllamas@google.com> Cc: Chandan Babu R <chandan.babu@oracle.com> Cc: Chao Yu <chao@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Coly Li <colyli@suse.de> Cc: Dai Ngo <Dai.Ngo@oracle.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Airlie <airlied@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Sterba <dsterba@suse.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Gao Xiang <hsiangkao@linux.alibaba.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Wang <jasowang@redhat.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Marijn Suijten <marijn.suijten@somainline.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nadav Amit <namit@vmware.com> Cc: Neil Brown <neilb@suse.de> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Cc: Olga Kornievskaia <kolga@netapp.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Clark <robdclark@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Sean Paul <sean@poorly.run> Cc: Song Liu <song@kernel.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Tom Talpey <tom@talpey.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yue Hu <huyue2@coolpad.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-15Merge tag 'amd-drm-next-6.5-2023-06-09' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.5-2023-06-02: amdgpu: - SR-IOV fixes - Warning fixes - Misc code cleanups and spelling fixes - DCN 3.2 updates - Improved DC FAMS support for better power management - Improved DC SubVP support for better power management - DCN 3.1.x fixes - Max IB size query - DC GPU reset fixes - RAS updates - DCN 3.0.x fixes - S/G display fixes - CP shadow buffer support - Implement connector force callback - Z8 power improvements - PSP 13.0.10 vbflash support - Mode2 reset fixes - Store MQDs in VRAM to improve queue switch latency - VCN 3.x fixes - JPEG 3.x fixes - Enable DC_FP on LoongArch - GFXOFF fixes - GC 9.4.3 partition support - SDMA 4.4.2 partition support - VCN/JPEG 4.0.3 partition support - VCN 4.0.3 updates - NBIO 7.9 updates - GC 9.4.3 updates - Take NUMA into account when allocating memory - Handle NUMA for partitions - SMU 13.0.6 updates - GC 9.4.3 RAS updates - Stop including unused swiotlb.h - SMU 13.0.7 fixes - Fix clock output ordering on some APUs - Clean up DC FPGA code - GFX9 preemption fixes - Misc irq fixes - S0ix fixes - Add new DRM_AMDGPU_WERROR config parameter to help with CI - PCIe fix for RDNA2 - kdoc fixes - Documentation updates amdkfd: - Query TTM mem limit rather than hardcoding it - GC 9.4.3 partition support - Handle NUMA for partitions radeon: - Fix possible double free - Stop including unused swiotlb.h - Fix possible division by zero ttm: - Add query for TTM mem limit - Add NUMA awareness to pools - Export ttm_pool_fini() UAPI: - Add new ctx query flag to better handle GPU resets Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22290 - Add new interface to query and set shadow buffer for RDNA3 Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21986 - Add new INFO query for max IB size Proposed userspace: https://gitlab.freedesktop.org/bnieuwenhuizen/mesa/-/commits/ib-rejection-v3 amd-drm-next-6.5-2023-06-09: amdgpu: - S0ix fixes - Initial SMU13 Overdrive support - kdoc fixes - Misc clode cleanups - Flexible array fixes - Display OTG fixes - SMU 13.0.6 updates - Revert some broken clock counter updates - Misc display fixes - GFX9 preemption fixes - Add support for newer EEPROM bad page table format - Add missing radeon secondary id - Add support for new colorspace KMS API - CSA fix - Stable pstate fixes for APUs - make vbl interface admin only - Handle PCI accelerator class amdkfd: - Add debugger support for gdb radeon: - Fix possible UAF drm: - Add Colorspace functionality UAPI: - Add debugger interface for enabling gdb Proposed userspace: https://github.com/ROCm-Developer-Tools/ROCdbgapi/tree/wip-dbgapi - Add KMS colorspace API Discussion: https://lists.freedesktop.org/archives/dri-devel/2023-June/408128.html From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609174817.7764-1-alexander.deucher@amd.com
2023-06-09drm/ttm: export ttm_pool_fini for cleanupRajneesh Bhardwaj
ttm_pool_init is exported and used outside of ttm subsystem with amdgpu_ttm interface, similarly export ttm_pool_fini for proper cleanup. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/ttm: add NUMA node id to the poolRajneesh Bhardwaj
This allows backing ttm_tt structure with pages from different NUMA pools. Tested-by: Graham Sider <graham.sider@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-27Merge tag 'mm-stable-2023-04-27-15-30' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of switching from a user process to a kernel thread. - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav. - zsmalloc performance improvements from Sergey Senozhatsky. - Yue Zhao has found and fixed some data race issues around the alteration of memcg userspace tunables. - VFS rationalizations from Christoph Hellwig: - removal of most of the callers of write_one_page() - make __filemap_get_folio()'s return value more useful - Luis Chamberlain has changed tmpfs so it no longer requires swap backing. Use `mount -o noswap'. - Qi Zheng has made the slab shrinkers operate locklessly, providing some scalability benefits. - Keith Busch has improved dmapool's performance, making part of its operations O(1) rather than O(n). - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd, permitting userspace to wr-protect anon memory unpopulated ptes. - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather than exclusive, and has fixed a bunch of errors which were caused by its unintuitive meaning. - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature, which causes minor faults to install a write-protected pte. - Vlastimil Babka has done some maintenance work on vma_merge(): cleanups to the kernel code and improvements to our userspace test harness. - Cleanups to do_fault_around() by Lorenzo Stoakes. - Mike Rapoport has moved a lot of initialization code out of various mm/ files and into mm/mm_init.c. - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for DRM, but DRM doesn't use it any more. - Lorenzo has also coverted read_kcore() and vread() to use iterators and has thereby removed the use of bounce buffers in some cases. - Lorenzo has also contributed further cleanups of vma_merge(). - Chaitanya Prakash provides some fixes to the mmap selftesting code. - Matthew Wilcox changes xfs and afs so they no longer take sleeping locks in ->map_page(), a step towards RCUification of pagefaults. - Suren Baghdasaryan has improved mmap_lock scalability by switching to per-VMA locking. - Frederic Weisbecker has reworked the percpu cache draining so that it no longer causes latency glitches on cpu isolated workloads. - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig logic. - Liu Shixin has changed zswap's initialization so we no longer waste a chunk of memory if zswap is not being used. - Yosry Ahmed has improved the performance of memcg statistics flushing. - David Stevens has fixed several issues involving khugepaged, userfaultfd and shmem. - Christoph Hellwig has provided some cleanup work to zram's IO-related code paths. - David Hildenbrand has fixed up some issues in the selftest code's testing of our pte state changing. - Pankaj Raghav has made page_endio() unneeded and has removed it. - Peter Xu contributed some rationalizations of the userfaultfd selftests. - Yosry Ahmed has fixed an issue around memcg's page recalim accounting. - Chaitanya Prakash has fixed some arm-related issues in the selftests/mm code. - Longlong Xia has improved the way in which KSM handles hwpoisoned pages. - Peter Xu fixes a few issues with uffd-wp at fork() time. - Stefan Roesch has changed KSM so that it may now be used on a per-process and per-cgroup basis. * tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm,unmap: avoid flushing TLB in batch if PTE is inaccessible shmem: restrict noswap option to initial user namespace mm/khugepaged: fix conflicting mods to collapse_file() sparse: remove unnecessary 0 values from rc mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() hugetlb: pte_alloc_huge() to replace huge pte_alloc_map() maple_tree: fix allocation in mas_sparse_area() mm: do not increment pgfault stats when page fault handler retries zsmalloc: allow only one active pool compaction context selftests/mm: add new selftests for KSM mm: add new KSM process and sysfs knobs mm: add new api to enable ksm per process mm: shrinkers: fix debugfs file permissions mm: don't check VMA write permissions if the PTE/PMD indicates write permissions migrate_pages_batch: fix statistics for longterm pin retry userfaultfd: use helper function range_in_vma() lib/show_mem.c: use for_each_populated_zone() simplify code mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list() fs/buffer: convert create_page_buffers to folio_create_buffers fs/buffer: add folio_create_empty_buffers helper ...
2023-04-14drm/ttm: revert "Reduce the number of used allocation orders for TTM pages"Christian König
This reverts commit 322458c2bb1a0398c5775333e1e71e1ece8a461f. PMD_SHIFT is not necessary constant on all architectures resulting in build failures. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/CAKMK7uHgUuqWJuqmZKrxi2mNiqExhmMif-naYnzUSj-puW-x+A@mail.gmail.com
2023-04-06drm/ttm: Reduce the number of used allocation orders for TTM pagesThomas Hellström
When swapping out, we will split multi-order pages both in order to move them to the swap-cache and to be able to return memory to the swap cache as soon as possible on a page-by-page basis. Reduce the page max order to the system PMD size, as we can then be nicer to the system and avoid splitting gigantic pages. Looking forward to when we might be able to swap out PMD size folios without splitting, this will also be a benefit. v2: - Include all orders up to the PMD size (Christian König) v3: - Avoid compilation errors for architectures with special PFN_SHIFTs Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230404200650.11043-3-thomas.hellstrom@linux.intel.com
2023-04-06drm/ttm/pool: Fix ttm_pool_alloc error pathThomas Hellström
When hitting an error, the error path forgot to unmap dma mappings and could call set_pages_wb() on already uncached pages. Fix this by introducing a common ttm_pool_free_range() function that does the right thing. v2: - Simplify that common function (Christian König) v3: - Rename that common function to ttm_pool_free_range() (Christian König) Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3") Cc: Christian König <christian.koenig@amd.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: dri-devel@lists.freedesktop.org Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230404200650.11043-2-thomas.hellstrom@linux.intel.com
2023-04-05mm, treewide: redefine MAX_ORDER sanelyKirill A. Shutemov
MAX_ORDER currently defined as number of orders page allocator supports: user can ask buddy allocator for page order between 0 and MAX_ORDER-1. This definition is counter-intuitive and lead to number of bugs all over the kernel. Change the definition of MAX_ORDER to be inclusive: the range of orders user can ask from buddy allocator is 0..MAX_ORDER now. [kirill@shutemov.name: fix min() warning] Link: https://lkml.kernel.org/r/20230315153800.32wib3n5rickolvh@box [akpm@linux-foundation.org: fix another min_t warning] [kirill@shutemov.name: fixups per Zi Yan] Link: https://lkml.kernel.org/r/20230316232144.b7ic4cif4kjiabws@box.shutemov.name [akpm@linux-foundation.org: fix underlining in docs] Link: https://lore.kernel.org/oe-kbuild-all/202303191025.VRCTk6mP-lkp@intel.com/ Link: https://lkml.kernel.org/r/20230315113133.11326-11-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-06drm/ttm: merge ttm_bo_api.h and ttm_bo_driver.h v2Christian König
Merge and cleanup the two headers into a single description of the object API. Also move all the documentation to the implementation and drop unnecessary includes from the header. No functional change. v2: minimal checkpatch.pl cleanup Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221125102137.1801-4-christian.koenig@amd.com
2022-11-08drm/ttm: optimize pool allocations a bit v2Christian König
If we got a page pool use it as much as possible. If we can't get more pages from the pool allocate as much as possible. Only if that still doesn't work reduce the order and try again. v2: minor cleanups Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221107195808.1873-1-christian.koenig@amd.com
2022-07-03mm: shrinkers: provide shrinkers with namesRoman Gushchin
Currently shrinkers are anonymous objects. For debugging purposes they can be identified by count/scan function names, but it's not always useful: e.g. for superblock's shrinkers it's nice to have at least an idea of to which superblock the shrinker belongs. This commit adds names to shrinkers. register_shrinker() and prealloc_shrinker() functions are extended to take a format and arguments to master a name. In some cases it's not possible to determine a good name at the time when a shrinker is allocated. For such cases shrinker_debugfs_rename() is provided. The expected format is: <subsystem>-<shrinker_type>[:<instance>]-<id> For some shrinkers an instance can be encoded as (MAJOR:MINOR) pair. After this change the shrinker debugfs directory looks like: $ cd /sys/kernel/debug/shrinker/ $ ls dquota-cache-16 sb-devpts-28 sb-proc-47 sb-tmpfs-42 mm-shadow-18 sb-devtmpfs-5 sb-proc-48 sb-tmpfs-43 mm-zspool:zram0-34 sb-hugetlbfs-17 sb-pstore-31 sb-tmpfs-44 rcu-kfree-0 sb-hugetlbfs-33 sb-rootfs-2 sb-tmpfs-49 sb-aio-20 sb-iomem-12 sb-securityfs-6 sb-tracefs-13 sb-anon_inodefs-15 sb-mqueue-21 sb-selinuxfs-22 sb-xfs:vda1-36 sb-bdev-3 sb-nsfs-4 sb-sockfs-8 sb-zsmalloc-19 sb-bpf-32 sb-pipefs-14 sb-sysfs-26 thp-deferred_split-10 sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-1 thp-zero-9 sb-cgroup2-30 sb-proc-39 sb-tmpfs-27 xfs-buf:vda1-37 sb-configfs-23 sb-proc-41 sb-tmpfs-29 xfs-inodegc:vda1-38 sb-dax-11 sb-proc-45 sb-tmpfs-35 sb-debugfs-7 sb-proc-46 sb-tmpfs-40 [roman.gushchin@linux.dev: fix build warnings] Link: https://lkml.kernel.org/r/Yr+ZTnLb9lJk6fJO@castle Reported-by: kernel test robot <lkp@intel.com> Link: https://lkml.kernel.org/r/20220601032227.4076670-4-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Dave Chinner <dchinner@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2021-09-29drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/Matthew Auld
It covers more than just ttm_bo_type_sg usage, like with say dma-buf, since one other user is userptr in amdgpu, and in the future we might have some more. Hence EXTERNAL is likely a more suitable name. v2(Christian): - Rename these to TTM_TT_FLAGS_* - Fix up all the holes in the flag values Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210929132629.353541-1-matthew.auld@intel.com Signed-off-by: Christian König <christian.koenig@amd.com>
2021-09-15drm/ttm: fix the type mismatch error on sparc64Huang Rui
__fls() on sparc64 return "int", but here it is expected as "unsigned long" (x86). It will cause the build errors because the warning becomes fatal while it is using sparc configuration. As suggested by Linus, it can use min_t instead of min to force the type as "unsigned int". Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210907100302.3684453-1-ray.huang@amd.com
2021-08-27drm/ttm: optimize the pool shrinker a bit v2Christian König
Switch back to using a spinlock again by moving the IOMMU unmap outside of the locked region. This avoids contention especially while freeing pages. v2: Add a comment explaining why we need sync_shrinkers(). Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210820120528.81114-3-christian.koenig@amd.com
2021-03-16Merge tag 'drm-misc-next-2021-03-03' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.13: UAPI Changes: Cross-subsystem Changes: Core Changes: - %p4cc printk format modifier - atomic: introduce drm_crtc_commit_wait, rework atomic plane state helpers to take the drm_commit_state structure - dma-buf: heaps rework to return a struct dma_buf - simple-kms: Add plate state helpers - ttm: debugfs support, removal of sysfs Driver Changes: - Convert drivers to shadow plane helpers - arc: Move to drm/tiny - ast: cursor plane reworks - gma500: Remove TTM and medfield support - mxsfb: imx8mm support - panfrost: MMU IRQ handling rework - qxl: rework to better handle resources deallocation, locking - sun4i: Add alpha properties for UI and VI layers - vc4: RPi4 CEC support - vmwgfx: doc cleanup Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210303100600.dgnkadonzuvfnu22@gilmour
2021-03-11drm/ttm: Fix TTM page pool accountingAnthony DeRossi
Freed pages are not subtracted from the allocated_pages counter in ttm_pool_type_fini(), causing a leak in the count on device removal. The next shrinker invocation loops forever trying to free pages that are no longer in the pool: rcu: INFO: rcu_sched self-detected stall on CPU rcu: 3-....: (9998 ticks this GP) idle=54e/1/0x4000000000000000 softirq=434857/434857 fqs=2237 (t=10001 jiffies g=2194533 q=49211) NMI backtrace for cpu 3 CPU: 3 PID: 1034 Comm: kswapd0 Tainted: P O 5.11.0-com #1 Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 1405 11/19/2019 Call Trace: <IRQ> ... </IRQ> sysvec_apic_timer_interrupt+0x77/0x80 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0010:mutex_unlock+0x16/0x20 Code: e7 48 8b 70 10 e8 7a 53 77 ff eb aa e8 43 6c ff ff 0f 1f 00 65 48 8b 14 25 00 6d 01 00 31 c9 48 89 d0 f0 48 0f b1 0f 48 39 c2 <74> 05 e9 e3 fe ff ff c3 66 90 48 8b 47 20 48 85 c0 74 0f 8b 50 10 RSP: 0018:ffffbdb840797be8 EFLAGS: 00000246 RAX: ffff9ff445a41c00 RBX: ffffffffc02a9ef8 RCX: 0000000000000000 RDX: ffff9ff445a41c00 RSI: ffffbdb840797c78 RDI: ffffffffc02a9ac0 RBP: 0000000000000080 R08: 0000000000000000 R09: ffffbdb840797c80 R10: 0000000000000000 R11: fffffffffffffff5 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000084 R15: ffffffffc02a9a60 ttm_pool_shrink+0x7d/0x90 [ttm] ttm_pool_shrinker_scan+0x5/0x20 [ttm] do_shrink_slab+0x13a/0x1a0 ... debugfs shows the incorrect total: $ cat /sys/kernel/debug/dri/0/ttm_page_pool --- 0--- --- 1--- --- 2--- --- 3--- --- 4--- --- 5--- --- 6--- --- 7--- --- 8--- --- 9--- ---10--- wc : 0 0 0 0 0 0 0 0 0 0 0 uc : 0 0 0 0 0 0 0 0 0 0 0 wc 32 : 0 0 0 0 0 0 0 0 0 0 0 uc 32 : 0 0 0 0 0 0 0 0 0 0 0 DMA uc : 0 0 0 0 0 0 0 0 0 0 0 DMA wc : 0 0 0 0 0 0 0 0 0 0 0 DMA : 0 0 0 0 0 0 0 0 0 0 0 total : 3029 of 8244261 Using ttm_pool_type_take() to remove pages from the pool before freeing them correctly accounts for the freed pages. Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3") Signed-off-by: Anthony DeRossi <ajderossi@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210303011723.22512-1-ajderossi@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-02-11drm/ttm: make sure pool pages are clearedChristian König
The old implementation wasn't consistend on this. But it looks like we depend on this so better bring it back. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-and-tested-by: Mike Galbraith <efault@gmx.de> Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3") Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210210160549.1462-1-christian.koenig@amd.com
2021-02-09drm/ttm: move memory accounting into vmwgfx v4Christian König
This is just another feature which is only used by VMWGFX, so move it into the driver instead. I've tried to add the accounting sysfs file to the kobject of the drm minor, but I'm not 100% sure if this works as expected. v2: fix typo in KFD and avoid 64bit divide v3: fix init order in VMWGFX v4: use pdev sysfs reference instead of drm Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Zack Rusin <zackr@vmware.com> (v3) Tested-by: Nirmoy Das <nirmoy.das@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210208133226.36955-2-christian.koenig@amd.com
2021-01-28drm/ttm: Use __GFP_NOWARN for huge pages in ttm_pool_alloc_pageMichel Dänzer
Without __GFP_NOWARN, attempts at allocating huge pages can trigger dmesg splats like below (which are essentially noise, since TTM falls back to normal pages if it can't get a huge one). [ 9556.710241] clinfo: page allocation failure: order:9, mode:0x194dc2(GFP_HIGHUSER|__GFP_RETRY_MAYFAIL|__GFP_NORETRY|__GFP_ZERO|__GFP_NOMEMALLOC), nodemask=(null),cpuset=user.slice,mems_allowed=0 [ 9556.710259] CPU: 1 PID: 470821 Comm: clinfo Tainted: G E 5.10.10+ #4 [ 9556.710264] Hardware name: Micro-Star International Co., Ltd. MS-7A34/B350 TOMAHAWK (MS-7A34), BIOS 1.OR 11/29/2019 [ 9556.710268] Call Trace: [ 9556.710281] dump_stack+0x6b/0x83 [ 9556.710288] warn_alloc.cold+0x7b/0xdf [ 9556.710297] ? __alloc_pages_direct_compact+0x137/0x150 [ 9556.710303] __alloc_pages_slowpath.constprop.0+0xc1b/0xc50 [ 9556.710312] __alloc_pages_nodemask+0x2ec/0x320 [ 9556.710325] ttm_pool_alloc+0x2e4/0x5e0 [ttm] [ 9556.710332] ? kvmalloc_node+0x46/0x80 [ 9556.710341] ttm_tt_populate+0x37/0xe0 [ttm] [ 9556.710350] ttm_bo_handle_move_mem+0x142/0x180 [ttm] [ 9556.710359] ttm_bo_validate+0x11d/0x190 [ttm] [ 9556.710391] ? drm_vma_offset_add+0x2f/0x60 [drm] [ 9556.710399] ttm_bo_init_reserved+0x2a7/0x320 [ttm] [ 9556.710529] amdgpu_bo_do_create+0x1b8/0x500 [amdgpu] [ 9556.710657] ? amdgpu_bo_subtract_pin_size+0x60/0x60 [amdgpu] [ 9556.710663] ? get_page_from_freelist+0x11f9/0x1450 [ 9556.710789] amdgpu_bo_create+0x40/0x270 [amdgpu] [ 9556.710797] ? _raw_spin_unlock+0x16/0x30 [ 9556.710927] amdgpu_gem_create_ioctl+0x123/0x310 [amdgpu] [ 9556.711062] ? amdgpu_gem_force_release+0x150/0x150 [amdgpu] [ 9556.711098] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 9556.711133] drm_ioctl+0x20f/0x3a0 [drm] [ 9556.711267] ? amdgpu_gem_force_release+0x150/0x150 [amdgpu] [ 9556.711276] ? preempt_count_sub+0x9b/0xd0 [ 9556.711404] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 9556.711411] __x64_sys_ioctl+0x83/0xb0 [ 9556.711417] do_syscall_64+0x33/0x80 [ 9556.711421] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: bf9eee249ac2 ("drm/ttm: stop using GFP_TRANSHUGE_LIGHT") Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/416353/
2021-01-20drm/ttm: optimize ttm pool shrinker a bitChristian König
Only initialize the DMA coherent pools if they are used. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/414957/
2021-01-20drm/ttm: add debugfs entry to test pool shrinker v2Christian König
Useful for testing. v2: add fs_reclaim_acquire()/_release() Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/414955/
2021-01-20drm/ttm: add a debugfs file for the global page poolsChristian König
Instead of printing this on the per device pool. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/414956/
2021-01-18drm/ttm: stop using GFP_TRANSHUGE_LIGHTChristian König
The only flag we really need is __GFP_NOMEMALLOC, highmem depends on dma32 and moveable/compound should never be set in the first place. Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/413812/ Link: https://patchwork.freedesktop.org/patch/413964/ Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3") Reported-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-01-12drm/ttm: make the pool shrinker lock a mutexChristian König
set_pages_wb() might sleep and so we can't do this in an atomic context. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3") Reviewed-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/413409/
2021-01-11drm/ttm: Fix address passed to dma_mapping_error() in ttm_pool_map()Jeremy Cline
check_unmap() is producing a warning about a missing map error check. The return value from dma_map_page() should be checked for an error, not the caller-provided dma_addr. Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3") Signed-off-by: Jeremy Cline <jcline@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/413432/ Signed-off-by: Christian König <christian.koenig@amd.com>
2021-01-07drm/ttm: unexport ttm_pool_init/finiChristian König
Drivers are not supposed to use this directly any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/412156/
2020-12-16drm/ttm: fix unused function warningArnd Bergmann
ttm_pool_type_count() is not used when debugfs is disabled: drivers/gpu/drm/ttm/ttm_pool.c:243:21: error: unused function 'ttm_pool_type_count' [-Werror,-Wunused-function] static unsigned int ttm_pool_type_count(struct ttm_pool_type *pt) Move the definition into the #ifdef block. Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Martin Peres <martin.peres@mupuf.org> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/405695/ Signed-off-by: Christian König <christian.koenig@amd.com>
2020-11-19drm/ttm: fix DMA32 handling in the global page poolChristian König
When we have mixed DMA32 and non DMA32 device in one system it could otherwise happen that the DMA32 device gets pages it can't work with. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/401317/
2020-11-13drm/ttm: fix missing NULL check in the new page poolChristian König
The pool parameter can be NULL if we free through the shrinker. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Martin Peres <martin.peres@mupuf.org> Acked-by: Martin Peres <martin.peres@mupuf.org> Reported-by: Andy Lavr <andy.lavr@gmail.com> Tested-by: Andy Lavr <andy.lavr@gmail.com> Link: https://patchwork.freedesktop.org/patch/399365/
2020-11-04drm/ttm: rework no_retry handling v2Christian König
During eviction we do want to trigger the OOM killer. Only while doing new allocations we should try to avoid that and return -ENOMEM to the application. v2: rename the flag to gfp_retry_mayfail. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/398685/
2020-10-29drm/ttm: new TT backend allocation pool v3Christian König
This replaces the spaghetti code in the two existing page pools. First of all depending on the allocation size it is between 3 (1GiB) and 5 (1MiB) times faster than the old implementation. It makes better use of buddy pages to allow for larger physical contiguous allocations which should result in better TLB utilization at least for amdgpu. Instead of a completely braindead approach of filling the pool with one CPU while another one is trying to shrink it we only give back freed pages. This also results in much less locking contention and a trylock free MM shrinker callback, so we can guarantee that pages are given back to the system when needed. Downside of this is that it takes longer for many small allocations until the pool is filled up. We could address this, but I couldn't find an use case where this actually matters. We also don't bother freeing large chunks of pages any more since the CPU overhead in that path isn't really that important. The sysfs files are replaced with a single module parameter, allowing users to override how many pages should be globally pooled in TTM. This unfortunately breaks the UAPI slightly, but as far as we know nobody ever depended on this. Zeroing memory coming from the pool was handled inconsistently. The alloc_pages() based pool was zeroing it, the dma_alloc_attr() based one wasn't. For now the new implementation isn't zeroing pages from the pool either and only sets the __GFP_ZERO flag when necessary. The implementation has only 768 lines of code compared to the over 2600 of the old one, and also allows for saving quite a bunch of code in the drivers since we don't need specialized handling there any more based on kernel config. Additional to all of that there was a neat bug with IOMMU, coherent DMA mappings and huge pages which is now fixed in the new code as well. v2: make ttm_pool_apply_caching static as reported by the kernel bot, add some more checks v3: fix some more checkpatch.pl warnings Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com> Tested-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/397080/?series=83051&rev=1