<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/fs, branch v6.5-rc6</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=v6.5-rc6</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=v6.5-rc6'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2023-08-12T20:28:55+00:00</updated>
<entry>
<title>Merge tag 'for-6.5-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux</title>
<updated>2023-08-12T20:28:55+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-08-12T20:28:55+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a785fd28d31f76d50004712b6e0b409d5a8239d8'/>
<id>urn:sha1:a785fd28d31f76d50004712b6e0b409d5a8239d8</id>
<content type='text'>
Pull btrfs fixes from David Sterba:
 "More fixes, some of them going back to older releases and there are
  fixes for hangs in stress tests regarding space caching:

   - fixes and progress tracking for hangs in free space caching, found
     by test generic/475

   - writeback fixes, write pages in integrity mode and skip writing
     pages that have been written meanwhile

   - properly clear end of extent range after an error

   - relocation fixes:
      - fix race betwen qgroup tree creation and relocation
      - detect and report invalid reloc roots"

* tag 'for-6.5-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: set cache_block_group_error if we find an error
  btrfs: reject invalid reloc tree root keys with stack dump
  btrfs: exit gracefully if reloc roots don't match
  btrfs: avoid race between qgroup tree creation and relocation
  btrfs: properly clear end of the unreserved range in cow_file_range
  btrfs: don't wait for writeback on clean pages in extent_write_cache_pages
  btrfs: don't stop integrity writeback too early
  btrfs: wait for actual caching progress during allocation
</content>
</entry>
<entry>
<title>locking: remove spin_lock_prefetch</title>
<updated>2023-08-12T16:18:47+00:00</updated>
<author>
<name>Mateusz Guzik</name>
<email>mjguzik@gmail.com</email>
</author>
<published>2023-08-12T16:15:54+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=c8afaa1b0f8bc93d013ab2ea6b9649958af3f1d3'/>
<id>urn:sha1:c8afaa1b0f8bc93d013ab2ea6b9649958af3f1d3</id>
<content type='text'>
The only remaining consumer is new_inode, where it showed up in 2001 as
commit c37fa164f793 ("v2.4.9.9 -&gt; v2.4.9.10") in a historical repo [1]
with a changelog which does not mention it.

Since then the line got only touched up to keep compiling.

While it may have been of benefit back in the day, it is guaranteed to
at best not get in the way in the multicore setting -- as the code
performs *a lot* of work between the prefetch and actual lock acquire,
any contention means the cacheline is already invalid by the time the
routine calls spin_lock().  It adds spurious traffic, for short.

On top of it prefetch is notoriously tricky to use for single-threaded
purposes, making it questionable from the get go.

As such, remove it.

I admit upfront I did not see value in benchmarking this change, but I
can do it if that is deemed appropriate.

Removal from new_inode and of the entire thing are in the same patch as
requested by Linus, so whatever weird looks can be directed at that guy.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/fs/inode.c?id=c37fa164f793735b32aa3f53154ff1a7659e6442 [1]
Signed-off-by: Mateusz Guzik &lt;mjguzik@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'zonefs-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs</title>
<updated>2023-08-12T01:35:56+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-08-12T01:35:56+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=0725a70411bd836231c3e78090ac47cac51246b1'/>
<id>urn:sha1:0725a70411bd836231c3e78090ac47cac51246b1</id>
<content type='text'>
Pull zonefs fix from Damien Le Moal:

 - The switch to using iomap for executing a direct synchronous write to
   sequential files using a zone append BIO overlooked cases where the
   BIO built by iomap is too large and needs splitting, which is not
   allowed with zone append.

   Fix this by using regular write commands instead. The use of zone
   append commands will be reintroduced later with proper support from
   iomap.

* tag 'zonefs-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  zonefs: fix synchronous direct writes to sequential files
</content>
</entry>
<entry>
<title>Merge tag 'mm-hotfixes-stable-2023-08-11-13-44' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2023-08-11T21:19:20+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-08-11T21:19:20+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=190bf7b14b0cf3df19c059061be032bd8994a597'/>
<id>urn:sha1:190bf7b14b0cf3df19c059061be032bd8994a597</id>
<content type='text'>
Pull misc fixes from Andrew Morton:
 "14 hotfixes. 11 of these are cc:stable and the remainder address
  post-6.4 issues, or are not considered suitable for -stable
  backporting"

* tag 'mm-hotfixes-stable-2023-08-11-13-44' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/damon/core: initialize damo_filter-&gt;list from damos_new_filter()
  nilfs2: fix use-after-free of nilfs_root in dirtying inodes via iput
  selftests: cgroup: fix test_kmem_basic false positives
  fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions
  MAINTAINERS: add maple tree mailing list
  mm: compaction: fix endless looping over same migrate block
  selftests: mm: ksm: fix incorrect evaluation of parameter
  hugetlb: do not clear hugetlb dtor until allocating vmemmap
  mm: memory-failure: avoid false hwpoison page mapped error info
  mm: memory-failure: fix potential unexpected return value from unpoison_memory()
  mm/swapfile: fix wrong swap entry type for hwpoisoned swapcache page
  radix tree test suite: fix incorrect allocation size for pthreads
  crypto, cifs: fix error handling in extract_iter_to_sg()
  zsmalloc: fix races between modifications of fullness and isolated
</content>
</entry>
<entry>
<title>btrfs: set cache_block_group_error if we find an error</title>
<updated>2023-08-10T15:16:45+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2023-08-02T13:20:24+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=92fb94b69c6accf1e49fff699640fa0ce03dc910'/>
<id>urn:sha1:92fb94b69c6accf1e49fff699640fa0ce03dc910</id>
<content type='text'>
We set cache_block_group_error if btrfs_cache_block_group() returns an
error, this is because we could end up not finding space to allocate and
mistakenly return -ENOSPC, and which could then abort the transaction
with the incorrect errno, and in the case of ENOSPC result in a
WARN_ON() that will trip up tests like generic/475.

However there's the case where multiple threads can be racing, one
thread gets the proper error, and the other thread doesn't actually call
btrfs_cache_block_group(), it instead sees -&gt;cached ==
BTRFS_CACHE_ERROR.  Again the result is the same, we fail to allocate
our space and return -ENOSPC.  Instead we need to set
cache_block_group_error to -EIO in this case to make sure that if we do
not make our allocation we get the appropriate error returned back to
the caller.

CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: reject invalid reloc tree root keys with stack dump</title>
<updated>2023-08-10T15:14:42+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2023-08-03T09:20:43+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=6ebcd021c92b8e4b904552e4d87283032100796d'/>
<id>urn:sha1:6ebcd021c92b8e4b904552e4d87283032100796d</id>
<content type='text'>
[BUG]
Syzbot reported a crash that an ASSERT() got triggered inside
prepare_to_merge().

That ASSERT() makes sure the reloc tree is properly pointed back by its
subvolume tree.

[CAUSE]
After more debugging output, it turns out we had an invalid reloc tree:

  BTRFS error (device loop1): reloc tree mismatch, root 8 has no reloc root, expect reloc root key (-8, 132, 8) gen 17

Note the above root key is (TREE_RELOC_OBJECTID, ROOT_ITEM,
QUOTA_TREE_OBJECTID), meaning it's a reloc tree for quota tree.

But reloc trees can only exist for subvolumes, as for non-subvolume
trees, we just COW the involved tree block, no need to create a reloc
tree since those tree blocks won't be shared with other trees.

Only subvolumes tree can share tree blocks with other trees (thus they
have BTRFS_ROOT_SHAREABLE flag).

Thus this new debug output proves my previous assumption that corrupted
on-disk data can trigger that ASSERT().

[FIX]
Besides the dedicated fix and the graceful exit, also let tree-checker to
check such root keys, to make sure reloc trees can only exist for subvolumes.

CC: stable@vger.kernel.org # 5.15+
Reported-by: syzbot+ae97a827ae1c3336bbb4@syzkaller.appspotmail.com
Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: exit gracefully if reloc roots don't match</title>
<updated>2023-08-10T15:13:13+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2023-08-03T09:20:42+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=05d7ce504545f7874529701664c90814ca645c5d'/>
<id>urn:sha1:05d7ce504545f7874529701664c90814ca645c5d</id>
<content type='text'>
[BUG]
Syzbot reported a crash that an ASSERT() got triggered inside
prepare_to_merge().

[CAUSE]
The root cause of the triggered ASSERT() is we can have a race between
quota tree creation and relocation.

This leads us to create a duplicated quota tree in the
btrfs_read_fs_root() path, and since it's treated as fs tree, it would
have ROOT_SHAREABLE flag, causing us to create a reloc tree for it.

The bug itself is fixed by a dedicated patch for it, but this already
taught us the ASSERT() is not something straightforward for
developers.

[ENHANCEMENT]
Instead of using an ASSERT(), let's handle it gracefully and output
extra info about the mismatch reloc roots to help debug.

Also with the above ASSERT() removed, we can trigger ASSERT(0)s inside
merge_reloc_roots() later.
Also replace those ASSERT(0)s with WARN_ON()s.

CC: stable@vger.kernel.org # 5.15+
Reported-by: syzbot+ae97a827ae1c3336bbb4@syzkaller.appspotmail.com
Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: avoid race between qgroup tree creation and relocation</title>
<updated>2023-08-10T15:10:10+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2023-08-03T09:20:41+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=773e722a98e25caf96f91aced7070c1858250ba2'/>
<id>urn:sha1:773e722a98e25caf96f91aced7070c1858250ba2</id>
<content type='text'>
[BUG]
Syzbot reported a weird ASSERT() triggered inside prepare_to_merge().

  assertion failed: root-&gt;reloc_root == reloc_root, in fs/btrfs/relocation.c:1919
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/relocation.c:1919!
  invalid opcode: 0000 [#1] PREEMPT SMP KASAN
  CPU: 0 PID: 9904 Comm: syz-executor.3 Not tainted
  6.4.0-syzkaller-08881-g533925cb7604 #0
  Hardware name: Google Google Compute Engine/Google Compute Engine,
  BIOS Google 05/27/2023
  RIP: 0010:prepare_to_merge+0xbb2/0xc40 fs/btrfs/relocation.c:1919
  Code: fe e9 f5 (...)
  RSP: 0018:ffffc9000325f760 EFLAGS: 00010246
  RAX: 000000000000004f RBX: ffff888075644030 RCX: 1481ccc522da5800
  RDX: ffffc90005c09000 RSI: 00000000000364ca RDI: 00000000000364cb
  RBP: ffffc9000325f870 R08: ffffffff816f33ac R09: 1ffff9200064bea0
  R10: dffffc0000000000 R11: fffff5200064bea1 R12: ffff888075644000
  R13: ffff88803b166000 R14: ffff88803b166560 R15: ffff88803b166558
  FS:  00007f4e305fd700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000056080679c000 CR3: 00000000193ad000 CR4: 00000000003506f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   &lt;TASK&gt;
   relocate_block_group+0xa5d/0xcd0 fs/btrfs/relocation.c:3749
   btrfs_relocate_block_group+0x7ab/0xd70 fs/btrfs/relocation.c:4087
   btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3283
   __btrfs_balance+0x1b06/0x2690 fs/btrfs/volumes.c:4018
   btrfs_balance+0xbdb/0x1120 fs/btrfs/volumes.c:4402
   btrfs_ioctl_balance+0x496/0x7c0 fs/btrfs/ioctl.c:3604
   vfs_ioctl fs/ioctl.c:51 [inline]
   __do_sys_ioctl fs/ioctl.c:870 [inline]
   __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:856
   do_syscall_x64 arch/x86/entry/common.c:50 [inline]
   do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
  RIP: 0033:0x7f4e2f88c389

[CAUSE]
With extra debugging, the offending reloc_root is for quota tree (rootid 8).

Normally we should not use the reloc tree for quota root at all, as reloc
trees are only for subvolume trees.

But there is a race between quota enabling and relocation, this happens
after commit 85724171b302 ("btrfs: fix the btrfs_get_global_root return value").

Before that commit, for quota and free space tree, we exit immediately
if we cannot grab it from fs_info.

But now we would try to read it from disk, just as if they are fs trees,
this sets ROOT_SHAREABLE flags in such race:

             Thread A             |           Thread B
 ---------------------------------+------------------------------
 btrfs_quota_enable()             |
 |                                | btrfs_get_root_ref()
 |                                | |- btrfs_get_global_root()
 |                                | |  Returned NULL
 |                                | |- btrfs_lookup_fs_root()
 |                                | |  Returned NULL
 |- btrfs_create_tree()           | |
 |  Now quota root item is        | |
 |  inserted                      | |- btrfs_read_tree_root()
 |                                | |  Got the newly inserted quota root
 |                                | |- btrfs_init_fs_root()
 |                                | |  Set ROOT_SHAREABLE flag

[FIX]
Get back to the old behavior by returning PTR_ERR(-ENOENT) if the target
objectid is not a subvolume tree or data reloc tree.

Reported-and-tested-by: syzbot+ae97a827ae1c3336bbb4@syzkaller.appspotmail.com
Fixes: 85724171b302 ("btrfs: fix the btrfs_get_global_root return value")
Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: properly clear end of the unreserved range in cow_file_range</title>
<updated>2023-08-10T15:07:10+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2023-07-24T13:26:55+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=12b2d64e591652a2d97dd3afa2b062ca7a4ba352'/>
<id>urn:sha1:12b2d64e591652a2d97dd3afa2b062ca7a4ba352</id>
<content type='text'>
When the call to btrfs_reloc_clone_csums in cow_file_range returns an
error, we jump to the out_unlock label with the extent_reserved variable
set to false.   The cleanup at the label will then call
extent_clear_unlock_delalloc on the range from start to end.  But we've
already added cur_alloc_size to start before the jump, so there might no
range be left from the newly incremented start to end.  Move the check for
'start &lt; end' so that it is reached by also for the !extent_reserved case.

CC: stable@vger.kernel.org # 6.1+
Fixes: a315e68f6e8b ("Btrfs: fix invalid attempt to free reserved space on failure to cow range")
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: don't wait for writeback on clean pages in extent_write_cache_pages</title>
<updated>2023-08-10T15:04:09+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2023-07-24T13:26:54+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=5c25699871112853f231e52d51c576d5c759a020'/>
<id>urn:sha1:5c25699871112853f231e52d51c576d5c759a020</id>
<content type='text'>
__extent_writepage could have started on more pages than the one it was
called for.  This happens regularly for zoned file systems, and in theory
could happen for compressed I/O if the worker thread was executed very
quickly. For such pages extent_write_cache_pages waits for writeback
to complete before moving on to the next page, which is highly inefficient
as it blocks the flusher thread.

Port over the PageDirty check that was added to write_cache_pages in
commit 515f4a037fb ("mm: write_cache_pages optimise page cleaning") to
fix this.

CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
</feed>
