<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/fs/xfs/xfs_alloc.h, branch docs-6.8</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-6.8</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-6.8'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2014-06-25T04:57:36+00:00</updated>
<entry>
<title>libxfs: move header files</title>
<updated>2014-06-25T04:57:36+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2014-06-25T04:57:36+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=84be0ffc9043f7c56044294eb775a2200452c76d'/>
<id>urn:sha1:84be0ffc9043f7c56044294eb775a2200452c76d</id>
<content type='text'>
Move all the header files that are shared with userspace into
libxfs. This is done as one big chunk simpy to get it done quickly.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Brian Foster &lt;bfoster@redhat.com&gt;
Signed-off-by: Dave Chinner &lt;david@fromorbit.com&gt;

</content>
</entry>
<entry>
<title>xfs: create a shared header file for format-related information</title>
<updated>2013-10-23T19:11:30+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2013-10-22T23:36:05+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=70a9883c5f34b215b8a77665cefd0398edc5a9ef'/>
<id>urn:sha1:70a9883c5f34b215b8a77665cefd0398edc5a9ef</id>
<content type='text'>
All of the buffer operations structures are needed to be exported
for xfs_db, so move them all to a common location rather than
spreading them all over the place. They are verifying the on-disk
format, so while xfs_format.h might be a good place, it is not part
of the on disk format.

Hence we need to create a new header file that we centralise these
related definitions. Start by moving the bffer operations
structures, and then also move all the other definitions that have
crept into xfs_log_format.h and xfs_format.h as there was no other
shared header file to put them in.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

</content>
</entry>
<entry>
<title>xfs: convert buffer verifiers to an ops structure.</title>
<updated>2012-11-16T03:35:12+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2012-11-14T06:54:40+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=1813dd64057490e7a0678a885c4fe6d02f78bdc1'/>
<id>urn:sha1:1813dd64057490e7a0678a885c4fe6d02f78bdc1</id>
<content type='text'>
To separate the verifiers from iodone functions and associate read
and write verifiers at the same time, introduce a buffer verifier
operations structure to the xfs_buf.

This avoids the need for assigning the write verifier, clearing the
iodone function and re-running ioend processing in the read
verifier, and gets rid of the nasty "b_pre_io" name for the write
verifier function pointer. If we ever need to, it will also be
easier to add further content specific callbacks to a buffer with an
ops structure in place.

We also avoid needing to export verifier functions, instead we
can simply export the ops structures for those that are needed
outside the function they are defined in.

This patch also fixes a directory block readahead verifier issue
it exposed.

This patch also adds ops callbacks to the inode/alloc btree blocks
initialised by growfs. These will need more work before they will
work with CRCs.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Phil White &lt;pwhite@sgi.com&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;
</content>
</entry>
<entry>
<title>xfs: connect up write verifiers to new buffers</title>
<updated>2012-11-16T03:35:09+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2012-11-14T06:53:49+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=b0f539de9fcc543a3ffa40bc22bf51aca6ea6183'/>
<id>urn:sha1:b0f539de9fcc543a3ffa40bc22bf51aca6ea6183</id>
<content type='text'>
Metadata buffers that are read from disk have write verifiers
already attached to them, but newly allocated buffers do not. Add
appropriate write verifiers to all new metadata buffers.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Ben Myers &lt;bpm@sgi.com&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;
</content>
</entry>
<entry>
<title>xfs: move allocation stack switch up to xfs_bmapi_allocate</title>
<updated>2012-10-18T22:42:48+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2012-10-05T01:06:59+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=e04426b9202bccd4cfcbc70b2fa2aeca1c86d8f5'/>
<id>urn:sha1:e04426b9202bccd4cfcbc70b2fa2aeca1c86d8f5</id>
<content type='text'>
Switching stacks are xfs_alloc_vextent can cause deadlocks when we
run out of worker threads on the allocation workqueue. This can
occur because xfs_bmap_btalloc can make multiple calls to
xfs_alloc_vextent() and even if xfs_alloc_vextent() fails it can
return with the AGF locked in the current allocation transaction.

If we then need to make another allocation, and all the allocation
worker contexts are exhausted because the are blocked waiting for
the AGF lock, holder of the AGF cannot get it's xfs-alloc_vextent
work completed to release the AGF.  Hence allocation effectively
deadlocks.

To avoid this, move the stack switch one layer up to
xfs_bmapi_allocate() so that all of the allocation attempts in a
single switched stack transaction occur in a single worker context.
This avoids the problem of an allocation being blocked waiting for
a worker thread whilst holding the AGF.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Mark Tinguely &lt;tinguely@sgi.com&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

</content>
</entry>
<entry>
<title>xfs: introduce XFS_BMAPI_STACK_SWITCH</title>
<updated>2012-10-18T22:41:56+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2012-10-05T01:06:58+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=2455881c0b52f87be539c4c7deab1afff4d8a560'/>
<id>urn:sha1:2455881c0b52f87be539c4c7deab1afff4d8a560</id>
<content type='text'>
Certain allocation paths through xfs_bmapi_write() are in situations
where we have limited stack available. These are almost always in
the buffered IO writeback path when convertion delayed allocation
extents to real extents.

The current stack switch occurs for userdata allocations, which
means we also do stack switches for preallocation, direct IO and
unwritten extent conversion, even those these call chains have never
been implicated in a stack overrun.

Hence, let's target just the single stack overun offended for stack
switches. To do that, introduce a XFS_BMAPI_STACK_SWITCH flag that
the caller can pass xfs_bmapi_write() to indicate it should switch
stacks if it needs to do allocation.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Mark Tinguely &lt;tinguely@sgi.com&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

</content>
</entry>
<entry>
<title>xfs: move busy extent handling to it's own file</title>
<updated>2012-05-14T21:20:55+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2012-04-29T10:39:43+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=efc27b52594e322d4c94e379489fa3690bf74739'/>
<id>urn:sha1:efc27b52594e322d4c94e379489fa3690bf74739</id>
<content type='text'>
To make it easier to handle userspace code merges, move all the busy
extent handling out of the allocation code and into it's own file.
The userspace code does not need the busy extent code, so this
simplifies the merging of the kernel code into the userspace
xfsprogs library.

Because the busy extent code has been almost completely rewritten
over the past couple of years, also update the copyright on this new
file to include the authors that made all those changes.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Mark Tinguely &lt;tinguely@sgi.com&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;
</content>
</entry>
<entry>
<title>xfs: fix fstrim offset calculations</title>
<updated>2012-03-27T21:07:03+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2012-03-22T05:15:12+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a66d636385d621e98a915233250356c394a437de'/>
<id>urn:sha1:a66d636385d621e98a915233250356c394a437de</id>
<content type='text'>
xfs_ioc_fstrim() doesn't treat the incoming offset and length
correctly. It treats them as a filesystem block address, rather than
a disk address. This is wrong because the range passed in is a
linear representation, while the filesystem block address notation
is a sparse representation. Hence we cannot convert the range direct
to filesystem block units and then use that for calculating the
range to trim.

While this sounds dangerous, the problem is limited to calculating
what AGs need to be trimmed. The code that calcuates the actual
ranges to trim gets the right result (i.e. only ever discards free
space), even though it uses the wrong ranges to limit what is
trimmed. Hence this is not a bug that endangers user data.

Fix this by treating the range as a disk address range and use the
appropriate functions to convert the range into the desired formats
for calculations.

Further, fix the first free extent lookup (the longest) to actually
find the largest free extent. Currently this lookup uses a &lt;=
lookup, which results in finding the extent to the left of the
largest because we can never get an exact match on the largest
extent. This is due to the fact that while we know it's size, we
don't know it's location and so the exact match fails and we move
one record to the left to get the next largest extent. Instead, use
a &gt;= search so that the lookup returns the largest extent regardless
of the fact we don't get an exact match on it.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;
</content>
</entry>
<entry>
<title>xfs: introduce an allocation workqueue</title>
<updated>2012-03-22T21:12:24+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2012-03-22T05:15:07+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=c999a223c2f0d31c64ef7379814cea1378b2b800'/>
<id>urn:sha1:c999a223c2f0d31c64ef7379814cea1378b2b800</id>
<content type='text'>
We currently have significant issues with the amount of stack that
allocation in XFS uses, especially in the writeback path. We can
easily consume 4k of stack between mapping the page, manipulating
the bmap btree and allocating blocks from the free list. Not to
mention btree block readahead and other functionality that issues IO
in the allocation path.

As a result, we can no longer fit allocation in the writeback path
in the stack space provided on x86_64. To alleviate this problem,
introduce an allocation workqueue and move all allocations to a
seperate context. This can be easily added as an interposing layer
into xfs_alloc_vextent(), which takes a single argument structure
and does not return until the allocation is complete or has failed.

To do this, add a work structure and a completion to the allocation
args structure. This allows xfs_alloc_vextent to queue the args onto
the workqueue and wait for it to be completed by the worker. This
can be done completely transparently to the caller.

The worker function needs to ensure that it sets and clears the
PF_TRANS flag appropriately as it is being run in an active
transaction context. Work can also be queued in a memory reclaim
context, so a rescuer is needed for the workqueue.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;
</content>
</entry>
<entry>
<title>xfs: do not discard alloc btree blocks</title>
<updated>2011-05-24T16:17:22+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@infradead.org</email>
</author>
<published>2011-05-04T18:55:15+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=55a7bc5a30ff2d30d8a34fea2af9fc601b32e61a'/>
<id>urn:sha1:55a7bc5a30ff2d30d8a34fea2af9fc601b32e61a</id>
<content type='text'>
Blocks for the allocation btree are allocated from and released to
the AGFL, and thus frequently reused.  Even worse we do not have an
easy way to avoid using an AGFL block when it is discarded due to
the simple FILO list of free blocks, and thus can frequently stall
on blocks that are currently undergoing a discard.

Add a flag to the busy extent tracking structure to skip the discard
for allocation btree blocks.  In normal operation these blocks are
reused frequently enough that there is no need to discard them
anyway, but if they spill over to the allocation btree as part of a
balance we "leak" blocks that we would otherwise discard.  We could
fix this by adding another flag and keeping these block in the
rbtree even after they aren't busy any more so that we could discard
them when they migrate out of the AGFL.  Given that this would cause
significant overhead I don't think it's worthwile for now.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Alex Elder &lt;aelder@sgi.com&gt;
</content>
</entry>
</feed>
