<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/fs/btrfs/delayed-ref.c, branch docs-next</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-next</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-next'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2026-04-07T16:55:55+00:00</updated>
<entry>
<title>btrfs: zoned: cap delayed refs metadata reservation to avoid overcommit</title>
<updated>2026-04-07T16:55:55+00:00</updated>
<author>
<name>Johannes Thumshirn</name>
<email>johannes.thumshirn@wdc.com</email>
</author>
<published>2026-02-10T11:04:21+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=7bcb04de982ff0718870112ad9f38c35cbca528b'/>
<id>urn:sha1:7bcb04de982ff0718870112ad9f38c35cbca528b</id>
<content type='text'>
On zoned filesystems, metadata space accounting can become overly optimistic
because delayed refs reservations grow without a hard upper bound.

The delayed_refs_rsv block reservation is allowed to speculatively grow and
is only backed by actual metadata space when refilled. On zoned devices this
can result in delayed_refs_rsv reserving a large portion of metadata space
that is already effectively unusable due to zone write pointer constraints.
As a result, space_info-&gt;may_use can grow far beyond the usable metadata
capacity, causing the allocator to believe space is available when it is not.

This leads to premature ENOSPC failures and "cannot satisfy tickets" reports
even though commits would be able to make progress by flushing delayed refs.

Analysis of "-o enospc_debug" dumps using a Python debug script
confirmed that delayed_refs_rsv was responsible for the majority of
metadata overcommit on zoned devices. By correlating space_info counters
(total, used, may_use, zone_unusable) across transactions, the analysis
showed that may_use continued to grow even after usable metadata space
was exhausted, with delayed refs refills accounting for the excess
reservations.

Here's the output of the analysis:

  ======================================================================
  Space Type: METADATA
  ======================================================================

  Raw Values:
    Total:                256.00 MB (268435456 bytes)
    Used:                 128.00 KB (131072 bytes)
    Pinned:                16.00 KB (16384 bytes)
    Reserved:             144.00 KB (147456 bytes)
    May Use:              255.48 MB (267894784 bytes)
    Zone Unusable:        192.00 KB (196608 bytes)

  Calculated Metrics:
    Actually Usable:       255.81 MB (total - zone_unusable)
    Committed:             255.77 MB (used + pinned + reserved + may_use)
    Consumed:              320.00 KB (used + zone_unusable)

  Percentages:
    Zone Unusable:    0.07% of total
    May Use:         99.80% of total
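
The calculated metrics in the dump follow directly from the raw counters.
As a rough sketch only (this is not the debug script itself, which is not
part of this message, and the function and key names are made up for
illustration), the arithmetic can be reproduced like this:

```python
# Illustrative only: recompute the "Calculated Metrics" from the raw
# METADATA counters shown in the dump. All names here are hypothetical.
MIB = 1024 * 1024


def metadata_metrics(total, used, pinned, reserved, may_use, zone_unusable):
    """Derive the summary values from the raw space_info counters (bytes)."""
    return {
        "actually_usable": total - zone_unusable,
        "committed": used + pinned + reserved + may_use,
        "consumed": used + zone_unusable,
        "may_use_pct": 100.0 * may_use / total,
        "zone_unusable_pct": 100.0 * zone_unusable / total,
    }


m = metadata_metrics(total=268435456, used=131072, pinned=16384,
                     reserved=147456, may_use=267894784,
                     zone_unusable=196608)
print(f"Actually Usable: {m['actually_usable'] / MIB:.2f} MB")
print(f"Committed:       {m['committed'] / MIB:.2f} MB")
print(f"May Use:         {m['may_use_pct']:.2f}% of total")
```

With these inputs, nearly all of the usable metadata space is already
committed, and may_use alone accounts for almost all of it.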

Fix this by adding a zoned-specific cap in btrfs_delayed_refs_rsv_refill():
Before reserving additional metadata bytes, limit the delayed refs
reservation based on the usable metadata space (total bytes minus
zone_unusable). If the reservation would exceed this cap, return -EAGAIN
to trigger the existing flush/commit logic instead of overcommitting
metadata space.
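
The idea of the cap can be modelled in a few lines. This is a simplified
sketch in Python, not the kernel code: the function name, its parameters,
and the exact policy (the reservation may not grow past the usable space)
are assumptions made for illustration.

```python
import errno


def refill_allowed(total_bytes, zone_unusable, rsv_reserved, num_bytes, zoned):
    """Return 0 if the refill may proceed, or -EAGAIN to request a flush.

    Hypothetical model of the check described in this message.
    """
    if not zoned:
        # Non-zoned filesystems are not affected by the cap.
        return 0
    usable = total_bytes - zone_unusable
    # Assumed policy: never let the reservation exceed the usable space.
    if rsv_reserved + num_bytes > usable:
        return -errno.EAGAIN
    return 0


# A refill that would push the reservation past the usable space.
print(refill_allowed(268435456, 196608, 267894784, 1048576, zoned=True))
# The same request on a non-zoned filesystem goes through.
print(refill_allowed(268435456, 196608, 267894784, 1048576, zoned=False))
```

Returning an error instead of reserving lets the caller fall back to the
existing flush/commit path, which is the behaviour described above.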

This preserves the existing reservation and flushing semantics while
preventing metadata overcommit on zoned devices. The change is limited to
metadata space and does not affect non-zoned filesystems.

This patch addresses premature metadata ENOSPC conditions on zoned devices
and ensures delayed refs are throttled before exhausting usable metadata.

Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)
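
To make the mapping concrete, the first two rules can be imitated with a toy
text rewrite. This is only a sketch in Python using simplified regular
expressions. The real conversion was done by the Coccinelle script, which
works on parsed C code and skips scalar types; the pattern and helper names
below are made up for illustration.

```python
import re

# Toy patterns, hypothetical and far less precise than the Coccinelle
# script: they match raw text and do not check whether TYPE is a scalar.
SINGLE = re.compile(r"\bkmalloc\(\s*sizeof\(([^()]+)\)\s*,\s*([^()]+)\)")
ARRAY = re.compile(
    r"\bkmalloc_array\(\s*([^(),]+?)\s*,\s*sizeof\(([^()]+)\)\s*,\s*([^()]+)\)")


def rewrite_allocs(src):
    """Apply the single and array allocation rules to a line of C text."""
    src = SINGLE.sub(r"kmalloc_obj(\1, \2)", src)
    return ARRAY.sub(r"kmalloc_objs(\2, \1, \3)", src)


print(rewrite_allocs("p = kmalloc(sizeof(*p), GFP_KERNEL);"))
print(rewrite_allocs("a = kmalloc_array(n, sizeof(struct foo), GFP_NOFS);"))
```

Note how the array rule swaps the argument order: the type comes first and
the count second.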

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>btrfs: remove fs_info argument from btrfs_reserve_metadata_bytes()</title>
<updated>2025-11-24T20:59:11+00:00</updated>
<author>
<name>Filipe Manana</name>
<email>fdmanana@suse.com</email>
</author>
<published>2025-10-13T17:27:16+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a1359d06d7878db4ac28d9c5134bc9771e56833d'/>
<id>urn:sha1:a1359d06d7878db4ac28d9c5134bc9771e56833d</id>
<content type='text'>
We don't need it since we can grab fs_info from the given space_info.
So remove the fs_info argument.

Reviewed-by: Qu Wenruo &lt;wqu@suse.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Anand Jain &lt;asj@kernel.org&gt;
Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: fix double free of qgroup record after failure to add delayed ref head</title>
<updated>2025-11-24T20:37:36+00:00</updated>
<author>
<name>Miquel Sabaté Solà</name>
<email>mssola@mssola.com</email>
</author>
<published>2025-10-01T18:05:03+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=725e46298876a2cc1f1c3fb22ba69d29102c3ddf'/>
<id>urn:sha1:725e46298876a2cc1f1c3fb22ba69d29102c3ddf</id>
<content type='text'>
In the previous code it was possible to run into a double kfree()
scenario when calling add_delayed_ref_head(). This could happen if the
record was reported to already exist in the
btrfs_qgroup_trace_extent_nolock() call, but then there was an error
later in add_delayed_ref_head(). In this case, since
add_delayed_ref_head() returned an error, the caller went on to free the
record. Since add_delayed_ref_head() couldn't set the caller's pointer
to NULL after freeing it, the caller's kfree() would then act on a
non-NULL 'record' pointer to memory already freed by the callee.

The problem comes from the fact that the responsibility to kfree the
object is on both the caller and the callee at the same time. Hence, the
fix for this is to shift the ownership of the 'qrecord' object out of
add_delayed_ref_head(). That is, we will never attempt to kfree()
the given object inside of this function, and will expect the caller to
act on the 'qrecord' object on its own. The only case where the
'qrecord' object must not be kfree'd is if it was inserted into the
tracing logic, for which we already have the 'qrecord_inserted_ret'
boolean to account for this. Hence, the caller has to kfree the object
only if add_delayed_ref_head() reports that it did not insert it into
the tracing logic.

As a side-effect of the above, we must guarantee that
'qrecord_inserted_ret' is properly initialized at the start of the
function, not at the end, and then set when an actual insert
happens. This way we avoid 'qrecord_inserted_ret' having an invalid
value on an early exit.

The documentation of add_delayed_ref_head() has also been updated
to reflect the exact ownership of the 'qrecord' object.

Fixes: 6ef8fbce0104 ("btrfs: fix missing error handling when adding delayed ref with qgroups enabled")
Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Miquel Sabaté Solà &lt;mssola@mssola.com&gt;
Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: annotate btrfs_is_testing() as unlikely and make it return bool</title>
<updated>2025-09-23T06:49:24+00:00</updated>
<author>
<name>Filipe Manana</name>
<email>fdmanana@suse.com</email>
</author>
<published>2025-09-19T08:55:12+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=db524fd9802f7ec3281a7f1fdae66c67c9525ba3'/>
<id>urn:sha1:db524fd9802f7ec3281a7f1fdae66c67c9525ba3</id>
<content type='text'>
We can annotate btrfs_is_testing() as unlikely, since not running the self
tests is by far the most common scenario and it's desirable for the
compiler to optimize for that case. So add the annotation to
btrfs_is_testing() and while at it also make it return bool instead of
int.

Also make two of the existing callers use btrfs_is_testing() directly
instead of storing its result in a local variable.

On x86_64 with Debian's gcc 14.2.0-19 this resulted in a very tiny object
code reduction.

Before this change:

  $ size fs/btrfs/btrfs.ko
     text	   data	    bss	    dec	    hex	filename
  1913263	 161567	  15592	2090422	 1fe5b6	fs/btrfs/btrfs.ko

After this change:

  $ size fs/btrfs/btrfs.ko
     text	   data	    bss	    dec	    hex	filename
  1913257	 161567	  15592	2090416	 1fe5b0	fs/btrfs/btrfs.ko

Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: fix typos in comments and strings</title>
<updated>2025-09-23T06:49:16+00:00</updated>
<author>
<name>David Sterba</name>
<email>dsterba@suse.com</email>
</author>
<published>2025-08-21T22:57:42+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=17dc82dc1e77a6fce07252ce894748190d1487d0'/>
<id>urn:sha1:17dc82dc1e77a6fce07252ce894748190d1487d0</id>
<content type='text'>
Annual typo fixing pass. Strangely, codespell found only about 30% of
what is in this patch; the rest was done manually using a text
spellchecker with a custom dictionary of acceptable terms.

Reviewed-by: Neal Gompa &lt;neal@gompa.dev&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: move ref-verify under CONFIG_BTRFS_DEBUG</title>
<updated>2025-09-22T08:54:32+00:00</updated>
<author>
<name>Leo Martins</name>
<email>loemra.dev@gmail.com</email>
</author>
<published>2025-08-12T23:28:27+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=cba7c35fec267188a9708deae857e9116c57497b'/>
<id>urn:sha1:cba7c35fec267188a9708deae857e9116c57497b</id>
<content type='text'>
Remove the CONFIG_BTRFS_FS_REF_VERIFY Kconfig option and make ref-verify
part of CONFIG_BTRFS_DEBUG. This should not noticeably affect the
performance of debug builds: struct btrfs_ref takes an additional u64, and
btrfs_fs_info takes an additional spinlock_t and rb_root. All of the
ref_verify logic is still protected by a mount option.

Signed-off-by: Leo Martins &lt;loemra.dev@gmail.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: add btrfs prefix to is_fstree() and make it return bool</title>
<updated>2025-07-21T21:58:04+00:00</updated>
<author>
<name>Filipe Manana</name>
<email>fdmanana@suse.com</email>
</author>
<published>2025-06-23T12:13:23+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=fd00922abc07d01bb4c5b71a6622fe0030855f22'/>
<id>urn:sha1:fd00922abc07d01bb4c5b71a6622fe0030855f22</id>
<content type='text'>
This is an exported function and therefore it should have a 'btrfs_'
prefix, to make it clear it's btrfs specific, avoid future name collisions
with code outside btrfs, and make its naming consistent with most other
btrfs exported functions.

So add a 'btrfs_' prefix to it and make it return bool instead of int,
since all we need is to return true or false.

Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: simplify return logic from btrfs_delayed_ref_init()</title>
<updated>2025-05-15T12:30:46+00:00</updated>
<author>
<name>Yangtao Li</name>
<email>frank.li@vivo.com</email>
</author>
<published>2025-04-14T12:52:31+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=ea2a8bacb103f0efcc0693c22fc72f310fe056ef'/>
<id>urn:sha1:ea2a8bacb103f0efcc0693c22fc72f310fe056ef</id>
<content type='text'>
Make this simpler by returning directly when there's no other cleanup
needed.

Signed-off-by: Yangtao Li &lt;frank.li@vivo.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: use rb_entry_safe() where possible to simplify code</title>
<updated>2025-05-15T12:30:40+00:00</updated>
<author>
<name>David Sterba</name>
<email>dsterba@suse.com</email>
</author>
<published>2025-03-27T16:19:18+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=6aa79c4f25197cc54479dc87d79ecd45571fb062'/>
<id>urn:sha1:6aa79c4f25197cc54479dc87d79ecd45571fb062</id>
<content type='text'>
Simplify conditionally reading an rb_entry(): the rb_entry_safe() helper
checks the node pointer for NULL, so we don't have to write the check
explicitly.

Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
</feed>
