<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/kernel/dma/swiotlb.c, branch docs-6.9</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-6.9</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-6.9'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2024-01-19T00:49:34+00:00</updated>
<entry>
<title>Merge tag 'dma-mapping-6.8-2024-01-18' of git://git.infradead.org/users/hch/dma-mapping</title>
<updated>2024-01-19T00:49:34+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-01-19T00:49:34+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=17e232b6d2feddd0285e59dbe641c0efe67a5ee6'/>
<id>urn:sha1:17e232b6d2feddd0285e59dbe641c0efe67a5ee6</id>
<content type='text'>
Pull dma-mapping fixes from Christoph Hellwig:

 - fix kerneldoc warnings (Randy Dunlap)

 - better bounds checking in swiotlb (ZhangPeng)

* tag 'dma-mapping-6.8-2024-01-18' of git://git.infradead.org/users/hch/dma-mapping:
  dma-debug: fix kernel-doc warnings
  swiotlb: check alloc_size before the allocation of a new memory pool
</content>
</entry>
<entry>
<title>Merge tag 'dma-mapping-6.8-2024-01-08' of git://git.infradead.org/users/hch/dma-mapping</title>
<updated>2024-01-11T21:46:50+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-01-11T21:46:50+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=893e2f9eac9ec8d65a5ae2d99656d5e9f7bd76e2'/>
<id>urn:sha1:893e2f9eac9ec8d65a5ae2d99656d5e9f7bd76e2</id>
<content type='text'>
Pull dma-mapping updates from Christoph Hellwig:

 - reduce area lock contention for non-primary IO TLB pools (Petr
   Tesarik)

 - don't store redundant offsets in the dma_ranges stuctures (Robin
   Murphy)

 - clear dev-&gt;dma_mem when freeing per-device pools (Joakim Zhang)

* tag 'dma-mapping-6.8-2024-01-08' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: clear dev-&gt;dma_mem to NULL after freeing it
  swiotlb: reduce area lock contention for non-primary IO TLB pools
  dma-mapping: don't store redundant offsets
</content>
</entry>
<entry>
<title>swiotlb: check alloc_size before the allocation of a new memory pool</title>
<updated>2024-01-09T15:58:36+00:00</updated>
<author>
<name>ZhangPeng</name>
<email>zhangpeng362@huawei.com</email>
</author>
<published>2024-01-09T02:45:47+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=3dc2f209208dddc73ffd1d817081711343de3242'/>
<id>urn:sha1:3dc2f209208dddc73ffd1d817081711343de3242</id>
<content type='text'>
The allocation request for swiotlb contiguous memory greater than
128*2KB cannot be fulfilled because it exceeds the maximum contiguous
memory limit. If the swiotlb memory we allocate is larger than 128*2KB,
swiotlb_find_slots() will still schedule the allocation of a new memory
pool, which will increase memory overhead.

Fix it by adding a check with alloc_size no more than 128*2KB before
scheduling the allocation of a new memory pool in swiotlb_find_slots().

Signed-off-by: ZhangPeng &lt;zhangpeng362@huawei.com&gt;
Reviewed-by: Petr Tesarik &lt;petr.tesarik1@huawei-partners.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER</title>
<updated>2024-01-08T23:27:15+00:00</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2023-12-28T14:47:04+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=5e0a760b44417f7cadd79de2204d6247109558a0'/>
<id>urn:sha1:5e0a760b44417f7cadd79de2204d6247109558a0</id>
<content type='text'>
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has
changed the definition of MAX_ORDER to be inclusive.  This has caused
issues with code that was not yet upstream and depended on the previous
definition.

To draw attention to the altered meaning of the define, rename MAX_ORDER
to MAX_PAGE_ORDER.

Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>swiotlb: reduce area lock contention for non-primary IO TLB pools</title>
<updated>2023-12-15T11:32:45+00:00</updated>
<author>
<name>Petr Tesarik</name>
<email>petr.tesarik1@huawei-partners.com</email>
</author>
<published>2023-12-01T12:13:52+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=55c543865b76830f524ce5ce1243266a93d06f84'/>
<id>urn:sha1:55c543865b76830f524ce5ce1243266a93d06f84</id>
<content type='text'>
If multiple areas and multiple IO TLB pools exist, first iterate the
current CPU specific area in all pools. Then move to the next area index.

This is best illustrated by a diagram:

        area 0 |  area 1 | ... | area M |
pool 0    A         B              C
pool 1    D         E
...
pool N    F         G              H

Currently, each pool is searched before moving on to the next pool,
i.e. the search order is A, B ... C, D, E ... F, G ... H. With this patch,
each area is searched in all pools before moving on to the next area,
i.e. the search order is A, D ... F, B, E ... G ... C ... H.

Note that preemption is not disabled, and raw_smp_processor_id() may not
return a stable result, but it is called only once to determine the initial
area index. The search will iterate over all areas eventually, even if the
current task is preempted.

Next, some pools may have less (but not more) areas than default_nareas.
Skip such pools if the distance from the initial area index is greater than
pool-&gt;nareas. This logic ensures that for every pool the search starts in
the initial CPU's own area and never tries any area twice.

To verify performance impact, I booted the kernel with a minimum pool
size ("swiotlb=512,4,force"), so multiple pools get allocated, and I ran
these benchmarks:

- small: single-threaded I/O of 4 KiB blocks,
- big: single-threaded I/O of 64 KiB blocks,
- 4way: 4-way parallel I/O of 4 KiB blocks.

The "var" column in the tables below is the coefficient of variance over 5
runs of the test, the "diff" column is the relative difference against base
in read-write I/O bandwidth (MiB/s).

Tested on an x86 VM against a QEMU virtio SATA driver backed by a RAM-based
block device on the host:

	base	   patched
	var	var	diff
small	0.69%	0.62%	+25.4%
big	2.14%	2.27%	+25.7%
4way	2.65%	1.70%	+23.6%

Tested on a Raspberry Pi against a class-10 A1 microSD card:

	base	   patched
	var	var	diff
small	0.53%	1.96%	-0.3%
big	0.02%	0.57%	+0.8%
4way	6.17%	0.40%	+0.3%

These results confirm that there is significant performance boost in the
software IO TLB slot allocation itself. Where performance is dominated by
actual hardware, there is no measurable change.

Signed-off-by: Petr Tesarik &lt;petr.tesarik1@huawei-partners.com&gt;
Reviewed-by: Mirsad Todorovac &lt;mirsad.todorovac@alu.unizg.hr&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>swiotlb: fix out-of-bounds TLB allocations with CONFIG_SWIOTLB_DYNAMIC</title>
<updated>2023-11-08T15:27:05+00:00</updated>
<author>
<name>Petr Tesarik</name>
<email>petr.tesarik1@huawei-partners.com</email>
</author>
<published>2023-11-08T11:12:49+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=53c87e846e335e3c18044c397cc35178163d7827'/>
<id>urn:sha1:53c87e846e335e3c18044c397cc35178163d7827</id>
<content type='text'>
Limit the free list length to the size of the IO TLB. Transient pool can be
smaller than IO_TLB_SEGSIZE, but the free list is initialized with the
assumption that the total number of slots is a multiple of IO_TLB_SEGSIZE.
As a result, swiotlb_area_find_slots() may allocate slots past the end of
a transient IO TLB buffer.

Reported-by: Niklas Schnelle &lt;schnelle@linux.ibm.com&gt;
Closes: https://lore.kernel.org/linux-iommu/104a8c8fedffd1ff8a2890983e2ec1c26bff6810.camel@linux.ibm.com/
Fixes: 79636caad361 ("swiotlb: if swiotlb is full, fall back to a transient memory pool")
Cc: stable@vger.kernel.org
Signed-off-by: Petr Tesarik &lt;petr.tesarik1@huawei-partners.com&gt;
Reviewed-by: Halil Pasic &lt;pasic@linux.ibm.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>swiotlb: do not free decrypted pages if dynamic</title>
<updated>2023-11-03T08:33:45+00:00</updated>
<author>
<name>Petr Tesarik</name>
<email>petrtesarik@huaweicloud.com</email>
</author>
<published>2023-11-02T09:36:49+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a5e3b127455d073f146a2a4ea3e7117635d34c5c'/>
<id>urn:sha1:a5e3b127455d073f146a2a4ea3e7117635d34c5c</id>
<content type='text'>
Fix these two error paths:

1. When set_memory_decrypted() fails, pages may be left fully or partially
   decrypted.

2. Decrypted pages may be freed if swiotlb_alloc_tlb() determines that the
   physical address is too high.

To fix the first issue, call set_memory_encrypted() on the allocated region
after a failed decryption attempt. If that also fails, leak the pages.

To fix the second issue, check that the TLB physical address is below the
requested limit before decrypting.

Let the caller differentiate between unsuitable physical address (=&gt; retry
from a lower zone) and allocation failures (=&gt; no point in retrying).

Cc: stable@vger.kernel.org
Fixes: 79636caad361 ("swiotlb: if swiotlb is full, fall back to a transient memory pool")
Signed-off-by: Petr Tesarik &lt;petr.tesarik1@huawei-partners.com&gt;
Reviewed-by: Rick Edgecombe &lt;rick.p.edgecombe@intel.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>Merge tag 'dma-mapping-6.7-2023-10-30' of git://git.infradead.org/users/hch/dma-mapping</title>
<updated>2023-11-01T23:15:54+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-11-01T23:15:54+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=009fbfc97b6367762efa257f1478ec86d37949f9'/>
<id>urn:sha1:009fbfc97b6367762efa257f1478ec86d37949f9</id>
<content type='text'>
Pull dma-mapping updates from Christoph Hellwig:

 - get rid of the fake support for coherent DMA allocation on coldfire
   with caches (Christoph Hellwig)

 - add a few Kconfig dependencies so that Kconfig catches the use of
   invalid configurations (Christoph Hellwig)

 - fix a type in dma-debug output (Chuck Lever)

 - rewrite a comment in swiotlb (Sean Christopherson)

* tag 'dma-mapping-6.7-2023-10-30' of git://git.infradead.org/users/hch/dma-mapping:
  dma-debug: Fix a typo in a debugging eye-catcher
  swiotlb: rewrite comment explaining why the source is preserved on DMA_FROM_DEVICE
  m68k: remove unused includes from dma.c
  m68k: don't provide arch_dma_alloc for nommu/coldfire
  net: fec: use dma_alloc_noncoherent for data cache enabled coldfire
  m68k: use the coherent DMA code for coldfire without data cache
  dma-direct: warn when coherent allocations aren't supported
  dma-direct: simplify the use atomic pool logic in dma_direct_alloc
  dma-direct: add a CONFIG_ARCH_HAS_DMA_ALLOC symbol
  dma-direct: add dependencies to CONFIG_DMA_GLOBAL_POOL
</content>
</entry>
<entry>
<title>swiotlb: do not try to allocate a TLB bigger than MAX_ORDER pages</title>
<updated>2023-10-25T14:26:20+00:00</updated>
<author>
<name>Petr Tesarik</name>
<email>petr.tesarik1@huawei-partners.com</email>
</author>
<published>2023-10-25T08:44:25+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=d5090484b021794271280ab64d20253883b7f6fd'/>
<id>urn:sha1:d5090484b021794271280ab64d20253883b7f6fd</id>
<content type='text'>
When allocating a new pool at runtime, reduce the number of slabs so
that the allocation order is at most MAX_ORDER.  This avoids a kernel
warning in __alloc_pages().

The warning is relatively benign, because the pool size is subsequently
reduced when allocation fails, but it is silly to start with a request
that is known to fail, especially since this is the default behavior if
the kernel is built with CONFIG_SWIOTLB_DYNAMIC=y and booted without any
swiotlb= parameter.

Reported-by: Ben Greear &lt;greearb@candelatech.com&gt;
Closes: https://lore.kernel.org/netdev/4f173dd2-324a-0240-ff8d-abf5c191be18@candelatech.com/
Fixes: 1aaa736815eb ("swiotlb: allocate a new memory pool when existing pools are full")
Signed-off-by: Petr Tesarik &lt;petr.tesarik1@huawei-partners.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>swiotlb: rewrite comment explaining why the source is preserved on DMA_FROM_DEVICE</title>
<updated>2023-10-23T05:51:36+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2023-10-20T15:42:15+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=1132a1dc053ef4391bb09fdb2242a628615291bb'/>
<id>urn:sha1:1132a1dc053ef4391bb09fdb2242a628615291bb</id>
<content type='text'>
Rewrite the comment explaining why swiotlb copies the original buffer to
the TLB buffer before initiating DMA *from* the device, i.e. before the
device DMAs into the TLB buffer.  The existing comment's argument that
preserving the original data can prevent a kernel memory leak is bogus.

If the driver that triggered the mapping _knows_ that the device will
overwrite the entire mapping, or the driver will consume only the written
parts, then copying from the original memory is completely pointless.

If neither of the above holds true, then copying from the original adds
value only if preserving the data is necessary for functional
correctness, or the driver explicitly initialized the original memory.
If the driver didn't initialize the memory, then copying the original
buffer to the TLB buffer simply changes what kernel data is leaked to
user space.

Writing the entire TLB buffer _does_ prevent leaking stale TLB buffer
data from a previous bounce, but that can be achieved by simply zeroing
the TLB buffer when grabbing a slot.

The real reason swiotlb ended up initializing the TLB buffer with the
original buffer is that it's necessary to make swiotlb operate as
transparently as possible, i.e. to behave as closely as possible to
hardware, and to avoid corrupting the original buffer, e.g. if the driver
knows the device will do partial writes and is relying on the unwritten
data to be preserved.

Reviewed-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Link: https://lore.kernel.org/all/ZN5elYQ5szQndN8n@google.com
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
</feed>
