<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/fs/jbd2/journal.c, branch docs-5.6</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.6</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.6'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2019-11-05T21:02:20+00:00</updated>
<entry>
<title>Merge branch 'jk/jbd2-revoke-overflow'</title>
<updated>2019-11-05T21:02:20+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2019-11-05T21:02:20+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a6d4040846bff49c7e870cee5693245f87f2cfce'/>
<id>urn:sha1:a6d4040846bff49c7e870cee5693245f87f2cfce</id>
<content type='text'>
</content>
</entry>
<entry>
<title>jbd2: Reserve space for revoke descriptor blocks</title>
<updated>2019-11-05T21:00:48+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2019-11-05T16:44:26+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=fdc3ef882a5d59c1709a13b5486ae2b1632e12b6'/>
<id>urn:sha1:fdc3ef882a5d59c1709a13b5486ae2b1632e12b6</id>
<content type='text'>
Extend functions for starting, extending, and restarting transaction
handles to take number of revoke records handle must be able to
accommodate. These functions then make sure transaction has enough
credits to be able to store resulting revoke descriptor blocks. Also
revoke code tracks number of revoke records created by a handle to catch
situation where some place didn't reserve enough space for revoke
records. Similarly to standard transaction credits, space for unused
reserved revoke records is released when the handle is stopped.

On the ext4 side we currently take a simplistic approach of reserving
space for 1024 revoke records for any transaction. This grows amount of
credits reserved for each handle only by a few and is enough for any
normal workload so that we don't hit warnings in jbd2. We will refine
the logic in following commits.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20191105164437.32602-20-jack@suse.cz
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>jbd2: Account descriptor blocks into t_outstanding_credits</title>
<updated>2019-11-05T21:00:48+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2019-11-05T16:44:24+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=9f356e5a4f12008fa0df8b6385fc0ab830416e72'/>
<id>urn:sha1:9f356e5a4f12008fa0df8b6385fc0ab830416e72</id>
<content type='text'>
Currently, journal descriptor blocks were not accounted in
transaction-&gt;t_outstanding_credits and we were just leaving some slack
space in the journal for them (in jbd2_log_space_left() and
jbd2_space_needed()). This is making proper accounting (and reservation
we want to add) of descriptor blocks difficult so switch to accounting
descriptor blocks in transaction-&gt;t_outstanding_credits and just reserve
the same amount of credits in t_outstanding credits for journal
descriptor blocks when creating transaction.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20191105164437.32602-18-jack@suse.cz
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>jbd2: Completely fill journal descriptor blocks</title>
<updated>2019-11-05T17:13:25+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2019-11-05T16:44:09+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=b90bfdf581194a0fa5f6c26fef1e522f15f6212e'/>
<id>urn:sha1:b90bfdf581194a0fa5f6c26fef1e522f15f6212e</id>
<content type='text'>
With 32-bit block numbers, we don't allocate the array for journal
buffer heads large enough for corresponding descriptor tags to fill the
descriptor block. Thus we end up writing out half-full descriptor blocks
to the journal unnecessarily growing the transaction. Fix the logic to
allocate the array large enough.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20191105164437.32602-3-jack@suse.cz
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>jbd2: Free journal head outside of locked region</title>
<updated>2019-10-21T13:16:46+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-08-09T12:42:33+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=7855a57d008b2354dd52078974529e08b889f98a'/>
<id>urn:sha1:7855a57d008b2354dd52078974529e08b889f98a</id>
<content type='text'>
On PREEMPT_RT bit-spinlocks have the same semantics as on PREEMPT_RT=n,
i.e. they disable preemption. That means functions which are not safe to be
called in preempt disabled context on RT trigger a might_sleep() assert.

The journal head bit spinlock is mostly held for short code sequences with
trivial RT safe functionality, except for one place:

jbd2_journal_put_journal_head() invokes __journal_remove_journal_head()
with the journal head bit spinlock held. __journal_remove_journal_head()
invokes kmem_cache_free() which must not be called with preemption disabled
on RT.

Jan suggested to rework the removal function so the actual free happens
outside the bit-spinlocked region.

Split it into two parts:

  - Do the sanity checks and the buffer head detach under the lock

  - Do the actual free after dropping the lock

There is error case handling in the free part which needs to dereference
the b_size field of the now detached buffer head. Due to paranoia (caused
by ignorance) the size is retrieved in the detach function and handed into
the free function. Might be over-engineered, but better safe than sorry.

This makes the journal head bit-spinlock usage RT compliant and also avoids
nested locking which is not covered by lockdep.

Suggested-by: Jan Kara &lt;jack@suse.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-ext4@vger.kernel.org
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Jan Kara &lt;jack@suse.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20190809124233.13277-8-jack@suse.cz
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>jbd2: Make state lock a spinlock</title>
<updated>2019-10-21T13:16:46+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-08-09T12:42:32+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=464170647b5648bb81f3615567485fcb9a685bed'/>
<id>urn:sha1:464170647b5648bb81f3615567485fcb9a685bed</id>
<content type='text'>
Bit-spinlocks are problematic on PREEMPT_RT if functions which might sleep
on RT, e.g. spin_lock(), alloc/free(), are invoked inside the lock held
region because bit spinlocks disable preemption even on RT.

A first attempt was to replace state lock with a spinlock placed in struct
buffer_head and make the locking conditional on PREEMPT_RT and
DEBUG_BIT_SPINLOCKS.

Jan pointed out that there is a 4 byte hole in struct journal_head where a
regular spinlock fits in and he would not object to convert the state lock
to a spinlock unconditionally.

Aside of solving the RT problem, this also gains lockdep coverage for the
journal head state lock (bit-spinlocks are not covered by lockdep as it's
hard to fit a lockdep map into a single bit).

The trivial change would have been to convert the jbd_*lock_bh_state()
inlines, but that comes with the downside that these functions take a
buffer head pointer which needs to be converted to a journal head pointer
which adds another level of indirection.

As almost all functions which use this lock have a journal head pointer
readily available, it makes more sense to remove the lock helper inlines
and write out spin_*lock() at all call sites.

Fixup all locking comments as well.

Suggested-by: Jan Kara &lt;jack@suse.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Mark Fasheh &lt;mark@fasheh.com&gt;
Cc: Joseph Qi &lt;joseph.qi@linux.alibaba.com&gt;
Cc: Joel Becker &lt;jlbec@evilplan.org&gt;
Cc: Jan Kara &lt;jack@suse.com&gt;
Cc: linux-ext4@vger.kernel.org
Link: https://lore.kernel.org/r/20190809124233.13277-7-jack@suse.cz
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>jbd2: remove jbd2_journal_inode_add_[write|wait]</title>
<updated>2019-09-24T22:54:07+00:00</updated>
<author>
<name>Joseph Qi</name>
<email>joseph.qi@linux.alibaba.com</email>
</author>
<published>2019-09-23T22:33:11+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=963abb9aebcdacf5c709a36da9ac0727a33671eb'/>
<id>urn:sha1:963abb9aebcdacf5c709a36da9ac0727a33671eb</id>
<content type='text'>
Since ext4/ocfs2 are using jbd2_inode dirty range scoping APIs now,
jbd2_journal_inode_add_[write|wait] are not used any more, remove them.

Link: http://lkml.kernel.org/r/1562977611-8412-2-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: Joseph Qi &lt;joseph.qi@linux.alibaba.com&gt;
Reviewed-by: Ross Zwisler &lt;zwisler@google.com&gt;
Acked-by: Changwei Ge &lt;chge@linux.alibaba.com&gt;
Cc: Gang He &lt;ghe@suse.com&gt;
Cc: Joel Becker &lt;jlbec@evilplan.org&gt;
Cc: Joseph Qi &lt;jiangqi903@gmail.com&gt;
Cc: Jun Piao &lt;piaojun@huawei.com&gt;
Cc: Junxiao Bi &lt;junxiao.bi@oracle.com&gt;
Cc: Mark Fasheh &lt;mark@fasheh.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>jbd2: drop declaration of journal_sync_buffer()</title>
<updated>2019-06-20T21:32:21+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2019-06-20T21:32:21+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=9382cde8cd8fb941fc333b644a5772d02e1ff924'/>
<id>urn:sha1:9382cde8cd8fb941fc333b644a5772d02e1ff924</id>
<content type='text'>
The journal_sync_buffer() function was never carried over from jbd to
jbd2.  So get rid of the vestigal declaration of this (non-existent)
function.

Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
</content>
</entry>
<entry>
<title>jbd2: introduce jbd2_inode dirty range scoping</title>
<updated>2019-06-20T21:24:56+00:00</updated>
<author>
<name>Ross Zwisler</name>
<email>zwisler@chromium.org</email>
</author>
<published>2019-06-20T21:24:56+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=6ba0e7dc64a5adcda2fbe65adc466891795d639e'/>
<id>urn:sha1:6ba0e7dc64a5adcda2fbe65adc466891795d639e</id>
<content type='text'>
Currently both journal_submit_inode_data_buffers() and
journal_finish_inode_data_buffers() operate on the entire address space
of each of the inodes associated with a given journal entry.  The
consequence of this is that if we have an inode where we are constantly
appending dirty pages we can end up waiting for an indefinite amount of
time in journal_finish_inode_data_buffers() while we wait for all the
pages under writeback to be written out.

The easiest way to cause this type of workload is do just dd from
/dev/zero to a file until it fills the entire filesystem.  This can
cause journal_finish_inode_data_buffers() to wait for the duration of
the entire dd operation.

We can improve this situation by scoping each of the inode dirty ranges
associated with a given transaction.  We do this via the jbd2_inode
structure so that the scoping is contained within jbd2 and so that it
follows the lifetime and locking rules for that structure.

This allows us to limit the writeback &amp; wait in
journal_submit_inode_data_buffers() and
journal_finish_inode_data_buffers() respectively to the dirty range for
a given struct jdb2_inode, keeping us from waiting forever if the inode
in question is still being appended to.

Signed-off-by: Ross Zwisler &lt;zwisler@google.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>jbd2: fix some print format mistakes</title>
<updated>2019-05-30T19:08:34+00:00</updated>
<author>
<name>Gaowei Pu</name>
<email>pugaowei@gmail.com</email>
</author>
<published>2019-05-30T19:08:34+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=7821ce417ec730a80d46ad0a513a84692433a5e8'/>
<id>urn:sha1:7821ce417ec730a80d46ad0a513a84692433a5e8</id>
<content type='text'>
There are some print format mistakes in debug messages. Fix them.

Signed-off-by: Gaowei Pu &lt;pugaowei@gmail.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;

</content>
</entry>
</feed>
