<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/drivers/gpu/drm/drm_mm.c, branch docs-5.13</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.13</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.13'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2020-06-23T13:46:40+00:00</updated>
<entry>
<title>drm/mm: cleanup and improve next_hole_*_addr()</title>
<updated>2020-06-23T13:46:40+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2020-06-15T14:16:42+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=5fad79fd66ff90b8c0a95319dad0b099008f8347'/>
<id>urn:sha1:5fad79fd66ff90b8c0a95319dad0b099008f8347</id>
<content type='text'>
Skipping just one branch of the tree is not the most
effective approach.

Instead use a macro to define the traversal functions and
sort out both branch sides.

This improves the performance of the unit tests by
a factor of more than 4.
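
As a rough illustration of generating both traversal directions from
one macro, the sketch below uses a small array-based binary tree instead
of the kernel rbtree. The names struct node, usable(), DEFINE_NEXT_HOLE,
prev_usable() and next_usable() are hypothetical and are not taken from
drm_mm.c; the max_hole field is an assumed stand-in for the largest hole
available in a subtree. Each generated function refuses to enter a child
subtree that cannot hold the request, on either side.

```c
/* Illustrative sketch only; not the code from drm_mm.c. */
#define NO_NODE (-1)

struct node {
	unsigned long long max_hole;	/* largest hole in this subtree */
	int parent, left, right;	/* array indices, or NO_NODE */
};

/* A subtree is worth entering only if it can hold `size`. */
int usable(const struct node *t, int n, unsigned long long size)
{
	if (n == NO_NODE)
		return 0;
	return t[n].max_hole >= size;
}

/*
 * Generate one traversal step. `first` is the child on the side we are
 * moving towards and `last` is the opposite child, so swapping the two
 * arguments yields the mirrored function.
 */
#define DEFINE_NEXT_HOLE(name, first, last) \
int name(const struct node *t, int n, unsigned long long size) \
{ \
	int parent; \
	if (n == NO_NODE) \
		return NO_NODE; \
	/* Descend only into subtrees that can satisfy the request. */ \
	if (usable(t, t[n].first, size)) { \
		n = t[n].first; \
		while (usable(t, t[n].last, size)) \
			n = t[n].last; \
		return n; \
	} \
	/* Otherwise climb until we arrive from the `last` side. */ \
	for (parent = t[n].parent; parent != NO_NODE; \
	     n = parent, parent = t[n].parent) \
		if (n != t[parent].first) \
			break; \
	return parent; \
}

DEFINE_NEXT_HOLE(prev_usable, left, right)	/* towards lower keys */
DEFINE_NEXT_HOLE(next_usable, right, left)	/* towards higher keys */
```

The node returned is only a candidate: a caller would still have to
check that node's own hole against the requested size and alignment and
step again if it does not fit.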

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/370298/
</content>
</entry>
<entry>
<title>drm/mm: optimize find_hole() as well</title>
<updated>2020-06-23T13:46:06+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2020-06-09T12:47:33+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=271e7decd707accfe6d4965b2c5e60d2d77c8e35'/>
<id>urn:sha1:271e7decd707accfe6d4965b2c5e60d2d77c8e35</id>
<content type='text'>
Abort early if there isn't enough space to allocate from a subtree.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/370297/
</content>
</entry>
<entry>
<title>drm/mm: remove unused rb_hole_size()</title>
<updated>2020-06-23T13:37:27+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2020-06-08T15:27:01+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=41c0e78aae53099852dab4d1050c06a43c827eb3'/>
<id>urn:sha1:41c0e78aae53099852dab4d1050c06a43c827eb3</id>
<content type='text'>
Just some code cleanup.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/370296/
</content>
</entry>
<entry>
<title>drm/mm: remove invalid entry based optimization</title>
<updated>2020-06-15T08:51:18+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2020-06-08T13:41:58+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=d2fb716a7abd984c1f34335dbb295629da527baf'/>
<id>urn:sha1:d2fb716a7abd984c1f34335dbb295629da527baf</id>
<content type='text'>
When the current entry is rejected as a candidate for the search,
it does not mean that we can abort the subtree search.

It is perfectly possible that only the alignment, and not the
size, is the reason for the rejection.
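
A small worked example of such a rejection, as a sketch: the helper
names align_up() and hole_fits() and the numbers used are assumptions
chosen for illustration, not code from drm_mm.c. A hole can be larger
than the request and still be unusable once the start address is
rounded up to the requested alignment.

```c
/* Illustrative helpers only; names and values are hypothetical. */

/* Round addr up to the next multiple of align (align must be nonzero). */
unsigned long long align_up(unsigned long long addr,
			    unsigned long long align)
{
	unsigned long long rem = addr % align;

	return rem ? addr + (align - rem) : addr;
}

/* Return 1 if `size` bytes aligned to `align` fit in the hole, else 0. */
int hole_fits(unsigned long long hole_start, unsigned long long hole_size,
	      unsigned long long size, unsigned long long align)
{
	unsigned long long hole_end = hole_start + hole_size;
	unsigned long long start = align_up(hole_start, align);

	if (start >= hole_end)
		return 0;
	return hole_end - start >= size;
}
```

With these helpers, hole_fits(4096, 6144, 4096, 8192) is 0 although the
hole is 6144 bytes: the aligned start is 8192 and only 2048 bytes remain
before the hole ends at 10240. In contrast, hole_fits(8192, 4096, 4096,
8192) is 1, so a smaller hole elsewhere in the subtree may still serve
the request.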

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/369394/
</content>
</entry>
<entry>
<title>drm/mm: fix hole size comparison</title>
<updated>2020-06-04T07:57:22+00:00</updated>
<author>
<name>Nirmoy Das</name>
<email>nirmoy.aiemd@gmail.com</email>
</author>
<published>2020-05-29T14:04:01+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=18ece75d7d74eb108f6a7325cf247077a666cba8'/>
<id>urn:sha1:18ece75d7d74eb108f6a7325cf247077a666cba8</id>
<content type='text'>
Fixes: 0cdea4455acd350a ("drm/mm: optimize rb_hole_addr rbtree search")

Signed-off-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Reported-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/367726/
</content>
</entry>
<entry>
<title>drm/mm: optimize rb_hole_addr rbtree search</title>
<updated>2020-05-05T11:39:38+00:00</updated>
<author>
<name>Nirmoy Das</name>
<email>nirmoy.das@amd.com</email>
</author>
<published>2020-05-04T15:40:35+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=0cdea4455acd350a7f62406478e3d6d1f764cef9'/>
<id>urn:sha1:0cdea4455acd350a7f62406478e3d6d1f764cef9</id>
<content type='text'>
Userspace can severely fragment the rb_hole_addr rbtree by manipulating
alignment while allocating buffers. A fragmented rb_hole_addr rbtree
results in long delays when allocating buffer objects for a
userspace application. Finding a suitable hole takes a long time
because, if the first attempt fails, we look at neighbouring nodes
using rb_prev()/rb_next(). Traversing the rbtree with
rb_prev()/rb_next() can take a very long time if the tree is
fragmented.

This patch improves searches in a fragmented rb_hole_addr rbtree by
converting it into an augmented rbtree that stores an extra field,
subtree_max_hole, in drm_mm_node. Each drm_mm_node now stores the maximum
hole size of its subtree in drm_mm_node-&gt;subtree_max_hole. Using
drm_mm_node-&gt;subtree_max_hole, it is possible to skip a complete
subtree if that subtree is unable to serve a request, which reduces the
number of rb_prev()/rb_next() calls.
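
The pruning idea can be sketched on a plain array-based binary tree.
Everything below is an assumption made for illustration: struct
hole_node, update_max() and find_lowest_hole() are hypothetical names,
the tree is not balanced, and the maximum is recomputed recursively
rather than maintained through the kernel's augmented rbtree callbacks.

```c
/* Illustrative sketch only; not the kernel rbtree API. */
#define NO_NODE (-1)

struct hole_node {
	unsigned long long start;		/* hole start address (key) */
	unsigned long long hole_size;		/* size of this node's hole */
	unsigned long long subtree_max_hole;	/* largest hole in subtree */
	int left, right;			/* child indices, or NO_NODE */
};

/* Recompute subtree_max_hole bottom-up; returns the maximum for node n. */
unsigned long long update_max(struct hole_node *t, int n)
{
	unsigned long long max, child;

	if (n == NO_NODE)
		return 0;
	max = t[n].hole_size;
	child = update_max(t, t[n].left);
	if (child > max)
		max = child;
	child = update_max(t, t[n].right);
	if (child > max)
		max = child;
	t[n].subtree_max_hole = max;
	return max;
}

/*
 * Return the lowest-address node whose hole is at least `size`, or
 * NO_NODE. A subtree whose subtree_max_hole is too small is skipped
 * without being entered; *visited counts the nodes actually entered.
 */
int find_lowest_hole(const struct hole_node *t, int n,
		     unsigned long long size, int *visited)
{
	int found;

	if (n == NO_NODE)
		return NO_NODE;
	if (size > t[n].subtree_max_hole)
		return NO_NODE;		/* nothing here can fit: prune */
	(*visited)++;
	found = find_lowest_hole(t, t[n].left, size, visited);
	if (found != NO_NODE)
		return found;
	if (t[n].hole_size >= size)
		return n;
	return find_lowest_hole(t, t[n].right, size, visited);
}
```

This sketch checks only the hole size; alignment and range restrictions
would still have to be checked on each candidate.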

With this patch applied, 1 million bo allocs on amdgpu took ~8 sec,
compared to 50k bo allocs which took 28 sec without it.

partial test code:
int test_fragmentation(void)
{
	int i;
	uint32_t minor_version;
	uint32_t major_version;

	struct amdgpu_bo_alloc_request request = {};
	amdgpu_bo_handle vram_handle[MAX_ALLOC] = {};
	amdgpu_device_handle device_handle;

	/* Request 4 KiB buffers with 8 KiB alignment in VRAM. */
	request.alloc_size = 4096;
	request.phys_alignment = 8192;
	request.preferred_heap = AMDGPU_GEM_DOMAIN_VRAM;

	int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
	amdgpu_device_initialize(fd, &amp;major_version, &amp;minor_version,
				 &amp;device_handle);

	/* Allocate MAX_ALLOC buffers, then free them all. */
	for (i = 0; i &lt; MAX_ALLOC; i++)
		amdgpu_bo_alloc(device_handle, &amp;request, &amp;vram_handle[i]);

	for (i = 0; i &lt; MAX_ALLOC; i++)
		amdgpu_bo_free(vram_handle[i]);

	return 0;
}

v2:
Use RB_DECLARE_CALLBACKS_MAX to maintain subtree_max_hole
v3:
insert_hole_addr() should be a static function
fix return value of next_hole_high_addr()/next_hole_low_addr()
Reported-by: kbuild test robot &lt;lkp@intel.com&gt;
v4:
Fix commit message.

Signed-off-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Reviewed-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/364341/
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/mm: revert "Break long searches in fragmented address spaces"</title>
<updated>2020-03-31T12:47:51+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2020-03-30T12:30:41+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=2713778cbfca756eccd754339b785712ef464a98'/>
<id>urn:sha1:2713778cbfca756eccd754339b785712ef464a98</id>
<content type='text'>
This reverts commit 7be1b9b8e9d1e9ef0342d2e001f44eec4030aa4d.

The drm_mm is supposed to work in atomic context, so calling schedule()
or, in this case, cond_resched() is illegal.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Link: https://patchwork.freedesktop.org/patch/359278/
</content>
</entry>
<entry>
<title>drm/mm: Remove redundant assignment in drm_mm_reserve_node</title>
<updated>2020-03-10T10:25:07+00:00</updated>
<author>
<name>Akeem G Abodunrin</name>
<email>akeem.g.abodunrin@intel.com</email>
</author>
<published>2020-03-09T15:11:56+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=0d1650fa2420ce689cd6cf70523c372d50f8a1ca'/>
<id>urn:sha1:0d1650fa2420ce689cd6cf70523c372d50f8a1ca</id>
<content type='text'>
In Pete Goodliffe's words, "You can improve a system by adding new code. You
can also improve a system by removing code." In this case, commit
"202b52b7fbf70" added new code to initialize the end of the node, so there
is no need for a duplicated initialization, and this patch simply removes it.

Signed-off-by: Akeem G Abodunrin &lt;akeem.g.abodunrin@intel.com&gt;
Cc: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Reviewed-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20200309151156.25040-1-akeem.g.abodunrin@intel.com
</content>
</entry>
<entry>
<title>drm/mm: Break long searches in fragmented address spaces</title>
<updated>2020-03-06T11:15:43+00:00</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2020-02-07T15:17:20+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=7be1b9b8e9d1e9ef0342d2e001f44eec4030aa4d'/>
<id>urn:sha1:7be1b9b8e9d1e9ef0342d2e001f44eec4030aa4d</id>
<content type='text'>
We try hard to select a suitable hole in the drm_mm the first time. But if
that is unsuccessful, we then have to look at neighbouring nodes, and
this requires traversing the rbtree. Walking the rbtree can be slow
(much slower than a linear list for deep trees), and if the drm_mm has
been purposefully fragmented, our search can be trapped for a long, long
time. For non-preemptible kernels, we need to break up long CPU-bound
sections by manually calling cond_resched(); similarly we should
also bail out if we have been told to terminate. (In an ideal world, we
would break for any signal, but we need to trade off having to perform
the search again after ERESTARTSYS, which again may form a trap of
making no forward progress.)

Reported-by: Zbigniew Kempczyński &lt;zbigniew.kempczynski@intel.com&gt;
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Zbigniew Kempczyński &lt;zbigniew.kempczynski@intel.com&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Reviewed-by: Andi Shyti &lt;andi.shyti@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20200207151720.2812125-1-chris@chris-wilson.co.uk
</content>
</entry>
<entry>
<title>drm/mm: Use clear_bit_unlock() for releasing the drm_mm_node()</title>
<updated>2019-10-04T12:43:43+00:00</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2019-10-03T21:01:00+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=3dda22d3dcd1fc39e7867b2c5f5fc8fa79fdefcc'/>
<id>urn:sha1:3dda22d3dcd1fc39e7867b2c5f5fc8fa79fdefcc</id>
<content type='text'>
A few callers need to serialise the destruction of their drm_mm_node and
ensure it is removed from the drm_mm before freeing. However, to be
completely sure that any access from another thread is complete before
we free the struct, we require the RELEASE semantics of
clear_bit_unlock().

This allows the conditional locking such as

Thread A			Thread B
  mutex_lock(mm_lock);		  if (drm_mm_node_allocated(node)) {
  drm_mm_node_remove(node);	    mutex_lock(mm_lock);
  mutex_unlock(mm_lock);	    if (drm_mm_node_allocated(node))
				      drm_mm_node_remove(node);
				    mutex_unlock(mm_lock);
				  }
				  kfree(node);

to serialise correctly without any lingering accesses from A to the
freed node. Allocation / insertion of the node is assumed never to race
with removal or eviction scanning.

Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20191003210100.22250-5-chris@chris-wilson.co.uk
</content>
</entry>
</feed>
