<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/fs/xfs/xfs_inode.h, branch docs-5.6</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.6</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.6'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2019-11-13T19:13:45+00:00</updated>
<entry>
<title>xfs: merge the projid fields in struct xfs_icdinode</title>
<updated>2019-11-13T19:13:45+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2019-11-12T16:22:54+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=de7a866fd41b227b0aa6e9cbeb0dae221c12f542'/>
<id>urn:sha1:de7a866fd41b227b0aa6e9cbeb0dae221c12f542</id>
<content type='text'>
There is no point in splitting the fields like this in an purely
in-memory structure.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
</content>
</entry>
<entry>
<title>xfs: remove the now unused dir ops infrastructure</title>
<updated>2019-11-11T00:54:24+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2019-11-08T23:06:02+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=957ee13e204a5ffe814139aa89e62eece4b969fd'/>
<id>urn:sha1:957ee13e204a5ffe814139aa89e62eece4b969fd</id>
<content type='text'>
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
</content>
</entry>
<entry>
<title>xfs: add a xfs_inode_buftarg helper</title>
<updated>2019-10-28T15:37:54+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2019-10-25T05:25:38+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=30fa529e3b2e6f1da277ef8525e4ce7979c57c57'/>
<id>urn:sha1:30fa529e3b2e6f1da277ef8525e4ce7979c57c57</id>
<content type='text'>
Add a new xfs_inode_buftarg helper that gets the data I/O buftarg for a
given inode.  Replace the existing xfs_find_bdev_for_inode and
xfs_find_daxdev_for_inode helpers with this new general one and cleanup
some of the callers.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
</content>
</entry>
<entry>
<title>xfs: widen inode delalloc block counter to 64-bits</title>
<updated>2019-04-23T15:36:23+00:00</updated>
<author>
<name>Darrick J. Wong</name>
<email>darrick.wong@oracle.com</email>
</author>
<published>2019-04-17T23:30:24+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=394aafdc15da01c049f580e097806ee12ad6900e'/>
<id>urn:sha1:394aafdc15da01c049f580e097806ee12ad6900e</id>
<content type='text'>
Widen the incore inode's i_delayed_blks counter to be a 64-bit integer.
This is necessary to fix an integer overflow problem that can be
reproduced easily now that we use the counter to track blocks that are
assigned to the inode in memory but not on disk.  This includes actual
delalloc reservations as well as real extents in the COW fork that
are waiting to be remapped into the data fork.

These 'delayed mapping' blocks can easily exceed 2^32 blocks if one
creates a very large sparse file of size approximately 2^33 bytes with
one byte written every 2^23 bytes, sets a very large COW extent size
hint of 2^23 blocks, reflinks the first file into a second file, and
then writes a single byte every 2^23 blocks in the original file.

When this happens, we'll try to create approximately 1024 2^23 extent
reservations in the COW fork, which will overflow the counter and cause
problems.

Note that on x64 we end up filling a 4-byte gap in the structure so this
doesn't increase the incore size.

Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Reviewed-by: Allison Collins &lt;allison.henderson@oracle.com&gt;
Reviewed-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>xfs: implement per-inode writeback completion queues</title>
<updated>2019-04-16T17:01:57+00:00</updated>
<author>
<name>Darrick J. Wong</name>
<email>darrick.wong@oracle.com</email>
</author>
<published>2019-04-15T20:13:20+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=cb357bf3d105f68ff5a5adcf89f1b285da675e2f'/>
<id>urn:sha1:cb357bf3d105f68ff5a5adcf89f1b285da675e2f</id>
<content type='text'>
When scheduling writeback of dirty file data in the page cache, XFS uses
IO completion workqueue items to ensure that filesystem metadata only
updates after the write completes successfully.  This is essential for
converting unwritten extents to real extents at the right time and
performing COW remappings.

Unfortunately, XFS queues each IO completion work item to an unbounded
workqueue, which means that the kernel can spawn dozens of threads to
try to handle the items quickly.  These threads need to take the ILOCK
to update file metadata, which results in heavy ILOCK contention if a
large number of the work items target a single file, which is
inefficient.

Worse yet, the writeback completion threads get stuck waiting for the
ILOCK while holding transaction reservations, which can use up all
available log reservation space.  When that happens, metadata updates to
other parts of the filesystem grind to a halt, even if the filesystem
could otherwise have handled it.

Even worse, if one of the things grinding to a halt happens to be a
thread in the middle of a defer-ops finish holding the same ILOCK and
trying to obtain more log reservation having exhausted the permanent
reservation, we now have an ABBA deadlock - writeback completion has a
transaction reserved and wants the ILOCK, and someone else has the ILOCK
and wants a transaction reservation.

Therefore, we create a per-inode writeback io completion queue + work
item.  When writeback finishes, it can add the ioend to the per-inode
queue and let the single worker item process that queue.  This
dramatically cuts down on the number of kworkers and ILOCK contention in
the system, and seems to have eliminated an occasional deadlock I was
seeing while running generic/476.

Testing with a program that simulates a heavy random-write workload to a
single file demonstrates that the number of kworkers drops from
approximately 120 threads per file to 1, without dramatically changing
write bandwidth or pagecache access latency.

Note that we leave the xfs-conv workqueue's max_active alone because we
still want to be able to run ioend processing for as many inodes as the
system can handle.

Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Reviewed-by: Brian Foster &lt;bfoster@redhat.com&gt;
</content>
</entry>
<entry>
<title>xfs: track metadata health status</title>
<updated>2019-04-15T01:15:57+00:00</updated>
<author>
<name>Darrick J. Wong</name>
<email>darrick.wong@oracle.com</email>
</author>
<published>2019-04-12T14:40:25+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=6772c1f11206f270af56d62bc26737864a63608a'/>
<id>urn:sha1:6772c1f11206f270af56d62bc26737864a63608a</id>
<content type='text'>
Add the necessary in-core metadata fields to keep track of which parts
of the filesystem have been observed and which parts were observed to be
unhealthy, and print a warning at unmount time if we have unfixed
problems.

Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Reviewed-by: Brian Foster &lt;bfoster@redhat.com&gt;
</content>
</entry>
<entry>
<title>xfs: cache unlinked pointers in an rhashtable</title>
<updated>2019-02-12T00:07:01+00:00</updated>
<author>
<name>Darrick J. Wong</name>
<email>darrick.wong@oracle.com</email>
</author>
<published>2019-02-07T18:37:16+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=9b2471797942a5947664818cfe2c6de93b43f37a'/>
<id>urn:sha1:9b2471797942a5947664818cfe2c6de93b43f37a</id>
<content type='text'>
Use a rhashtable to cache the unlinked list incore.  This should speed
up unlinked processing considerably when there are a lot of inodes on
the unlinked list because iunlink_remove no longer has to traverse an
entire bucket list to find which inode points to the one being removed.

The incore list structure records "X.next_unlinked = Y" relations, with
the rhashtable using Y to index the records.  This makes finding the
inode X that points to a inode Y very quick.  If our cache fails to find
anything we can always fall back on the old method.

FWIW this drastically reduces the amount of time it takes to remove
inodes from the unlinked list.  I wrote a program to open a lot of
O_TMPFILE files and then close them in the same order, which takes
a very long time if we have to traverse the unlinked lists.  With the
ptach, I see:

+ /d/t/tmpfile/tmpfile
Opened 193531 files in 6.33s.
Closed 193531 files in 5.86s

real    0m12.192s
user    0m0.064s
sys     0m11.619s
+ cd /
+ umount /mnt

real    0m0.050s
user    0m0.004s
sys     0m0.030s

And without the patch:

+ /d/t/tmpfile/tmpfile
Opened 193588 files in 6.35s.
Closed 193588 files in 751.61s

real    12m38.853s
user    0m0.084s
sys     12m34.470s
+ cd /
+ umount /mnt

real    0m0.086s
user    0m0.000s
sys     0m0.060s

Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Reviewed-by: Brian Foster &lt;bfoster@redhat.com&gt;
</content>
</entry>
<entry>
<title>xfs: fold dfops into the transaction</title>
<updated>2018-08-03T06:05:14+00:00</updated>
<author>
<name>Brian Foster</name>
<email>bfoster@redhat.com</email>
</author>
<published>2018-08-01T14:20:35+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=9d9e6233859706875c392707efd6d516cfb764fb'/>
<id>urn:sha1:9d9e6233859706875c392707efd6d516cfb764fb</id>
<content type='text'>
struct xfs_defer_ops has now been reduced to a single list_head. The
external dfops mechanism is unused and thus everywhere a (permanent)
transaction is accessible the associated dfops structure is as well.

Remove the xfs_defer_ops structure and fold the list_head into the
transaction. Also remove the last remnant of external dfops in
xfs_trans_dup().

Signed-off-by: Brian Foster &lt;bfoster@redhat.com&gt;
Reviewed-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
</content>
</entry>
<entry>
<title>xfs: introduce a new xfs_inode_has_cow_data helper</title>
<updated>2018-07-30T14:57:48+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2018-07-17T23:51:51+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=51d626903083f7bd651d38b031775740ed41758c'/>
<id>urn:sha1:51d626903083f7bd651d38b031775740ed41758c</id>
<content type='text'>
We have a few places that already check if an inode has actual data in
the COW fork to avoid work on reflink inodes that do not actually have
outstanding COW blocks.  There are a few more places that can avoid
working if doing the same check, so add a documented helper for this
condition and use it in all places where it makes sense.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
</content>
</entry>
<entry>
<title>xfs: remove the xfs_ifork_t typedef</title>
<updated>2018-07-30T14:57:48+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2018-07-17T23:51:50+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=3ba738df25239f877f6a98ce1cc925fa7e924cd3'/>
<id>urn:sha1:3ba738df25239f877f6a98ce1cc925fa7e924cd3</id>
<content type='text'>
We only have a few more callers left, so seize the opportunity and kill
it off.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
</content>
</entry>
</feed>
