diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2012-10-11 09:04:23 +0900 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-10-11 09:04:23 +0900 |
commit | ce40be7a820bb393ac4ac69865f018d2f4038cf0 (patch) | |
tree | b1fe5a93346eb06f22b1c303d63ec5456d7212ab /Documentation/percpu-rw-semaphore.txt | |
parent | ba0a5a36f60e4c1152af3a2ae2813251974405bf (diff) | |
parent | 02f3939e1a9357b7c370a4a69717cf9c02452737 (diff) | |
download | lwn-ce40be7a820bb393ac4ac69865f018d2f4038cf0.tar.gz lwn-ce40be7a820bb393ac4ac69865f018d2f4038cf0.zip |
Merge branch 'for-3.7/core' of git://git.kernel.dk/linux-block
Pull block IO update from Jens Axboe:
"Core block IO bits for 3.7. Not a huge round this time, it contains:
- First series from Kent cleaning up and generalizing bio allocation
and freeing.
- WRITE_SAME support from Martin.
- Mikulas patches to prevent O_DIRECT crashes when someone changes
the block size of a device.
- Make bio_split() work on data-less bio's (like trim/discards).
- A few other minor fixups."
Fixed up silent semantic mis-merge as per Mikulas Patocka and Andrew
Morton. It is due to the VM no longer using a prio-tree (see commit
6b2dbba8b6ac: "mm: replace vma prio_tree with an interval tree").
So make set_blocksize() use mapping_mapped() instead of open-coding the
internal VM knowledge that has changed.
* 'for-3.7/core' of git://git.kernel.dk/linux-block: (26 commits)
block: makes bio_split support bio without data
scatterlist: refactor the sg_nents
scatterlist: add sg_nents
fs: fix include/percpu-rwsem.h export error
percpu-rw-semaphore: fix documentation typos
fs/block_dev.c:1644:5: sparse: symbol 'blkdev_mmap' was not declared
blockdev: turn a rw semaphore into a percpu rw semaphore
Fix a crash when block device is read and block size is changed at the same time
block: fix request_queue->flags initialization
block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()
block: ioctl to zero block ranges
block: Make blkdev_issue_zeroout use WRITE SAME
block: Implement support for WRITE SAME
block: Consolidate command flag and queue limit checks for merges
block: Clean up special command handling logic
block/blk-tag.c: Remove useless kfree
block: remove the duplicated setting for congestion_threshold
block: reject invalid queue attribute values
block: Add bio_clone_bioset(), bio_clone_kmalloc()
block: Consolidate bio_alloc_bioset(), bio_kmalloc()
...
Diffstat (limited to 'Documentation/percpu-rw-semaphore.txt')
-rw-r--r-- | Documentation/percpu-rw-semaphore.txt | 27 |
1 files changed, 27 insertions, 0 deletions
diff --git a/Documentation/percpu-rw-semaphore.txt b/Documentation/percpu-rw-semaphore.txt new file mode 100644 index 000000000000..7d3c82431909 --- /dev/null +++ b/Documentation/percpu-rw-semaphore.txt @@ -0,0 +1,27 @@ +Percpu rw semaphores +-------------------- + +Percpu rw semaphores is a new read-write semaphore design that is +optimized for locking for reading. + +The problem with traditional read-write semaphores is that when multiple +cores take the lock for reading, the cache line containing the semaphore +is bouncing between L1 caches of the cores, causing performance +degradation. + +Locking for reading is very fast, it uses RCU and it avoids any atomic +instruction in the lock and unlock path. On the other hand, locking for +writing is very expensive, it calls synchronize_rcu() that can take +hundreds of milliseconds. + +The lock is declared with "struct percpu_rw_semaphore" type. +The lock is initialized percpu_init_rwsem, it returns 0 on success and +-ENOMEM on allocation failure. +The lock must be freed with percpu_free_rwsem to avoid memory leak. + +The lock is locked for read with percpu_down_read, percpu_up_read and +for write with percpu_down_write, percpu_up_write. + +The idea of using RCU for optimized rw-lock was introduced by +Eric Dumazet <eric.dumazet@gmail.com>. +The code was written by Mikulas Patocka <mpatocka@redhat.com> |