author | Linus Torvalds <torvalds@linux-foundation.org> | 2012-10-11 09:04:23 +0900 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-10-11 09:04:23 +0900 |
commit | ce40be7a820bb393ac4ac69865f018d2f4038cf0 | |
tree | b1fe5a93346eb06f22b1c303d63ec5456d7212ab /Documentation | |
parent | ba0a5a36f60e4c1152af3a2ae2813251974405bf | |
parent | 02f3939e1a9357b7c370a4a69717cf9c02452737 | |
Merge branch 'for-3.7/core' of git://git.kernel.dk/linux-block

Pull block IO update from Jens Axboe:
 "Core block IO bits for 3.7. Not a huge round this time, it contains:

   - First series from Kent cleaning up and generalizing bio allocation
     and freeing.

   - WRITE_SAME support from Martin.

   - Mikulas patches to prevent O_DIRECT crashes when someone changes
     the block size of a device.

   - Make bio_split() work on data-less bio's (like trim/discards).

   - A few other minor fixups."

Fixed up silent semantic mis-merge as per Mikulas Patocka and Andrew
Morton. It is due to the VM no longer using a prio-tree (see commit
6b2dbba8b6ac: "mm: replace vma prio_tree with an interval tree").

So make set_blocksize() use mapping_mapped() instead of open-coding the
internal VM knowledge that has changed.

* 'for-3.7/core' of git://git.kernel.dk/linux-block: (26 commits)
block: makes bio_split support bio without data
scatterlist: refactor the sg_nents
scatterlist: add sg_nents
fs: fix include/percpu-rwsem.h export error
percpu-rw-semaphore: fix documentation typos
fs/block_dev.c:1644:5: sparse: symbol 'blkdev_mmap' was not declared
blockdev: turn a rw semaphore into a percpu rw semaphore
Fix a crash when block device is read and block size is changed at the same time
block: fix request_queue->flags initialization
block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()
block: ioctl to zero block ranges
block: Make blkdev_issue_zeroout use WRITE SAME
block: Implement support for WRITE SAME
block: Consolidate command flag and queue limit checks for merges
block: Clean up special command handling logic
block/blk-tag.c: Remove useless kfree
block: remove the duplicated setting for congestion_threshold
block: reject invalid queue attribute values
block: Add bio_clone_bioset(), bio_clone_kmalloc()
block: Consolidate bio_alloc_bioset(), bio_kmalloc()
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/ABI/testing/sysfs-block | 14 |
-rw-r--r-- | Documentation/block/biodoc.txt | 5 |
-rw-r--r-- | Documentation/percpu-rw-semaphore.txt | 27 |
3 files changed, 41 insertions, 5 deletions
diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index c1eb41cb9876..279da08f7541 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -206,3 +206,17 @@ Description:
                 when a discarded area is read the discard_zeroes_data
                 parameter will be set to one. Otherwise it will be 0 and
                 the result of reading a discarded area is undefined.
+
+What:           /sys/block/<disk>/queue/write_same_max_bytes
+Date:           January 2012
+Contact:        Martin K. Petersen <martin.petersen@oracle.com>
+Description:
+                Some devices support a write same operation in which a
+                single data block can be written to a range of several
+                contiguous blocks on storage. This can be used to wipe
+                areas on disk or to initialize drives in a RAID
+                configuration. write_same_max_bytes indicates how many
+                bytes can be written in a single write same command. If
+                write_same_max_bytes is 0, write same is not supported
+                by the device.
+
diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt
index e418dc0a7086..8df5e8e6dceb 100644
--- a/Documentation/block/biodoc.txt
+++ b/Documentation/block/biodoc.txt
@@ -465,7 +465,6 @@ struct bio {
        bio_end_io_t *bi_end_io; /* bi_end_io (bio) */
        atomic_t bi_cnt; /* pin count: free when it hits zero */
        void *bi_private;
-       bio_destructor_t *bi_destructor; /* bi_destructor (bio) */
 };
 
 With this multipage bio design:
@@ -647,10 +646,6 @@ for a non-clone bio. There are the 6 pools setup for different size biovecs,
 so bio_alloc(gfp_mask, nr_iovecs) will allocate a vec_list of the given size
 from these slabs.
 
-The bi_destructor() routine takes into account the possibility of the bio
-having originated from a different source (see later discussions on
-n/w to block transfers and kvec_cb)
-
 The bio_get() routine may be used to hold an extra reference on a bio prior
 to i/o submission, if the bio fields are likely to be accessed after the
 i/o is issued (since the bio may otherwise get freed in case i/o completion
diff --git a/Documentation/percpu-rw-semaphore.txt b/Documentation/percpu-rw-semaphore.txt
new file mode 100644
index 000000000000..7d3c82431909
--- /dev/null
+++ b/Documentation/percpu-rw-semaphore.txt
@@ -0,0 +1,27 @@
+Percpu rw semaphores
+--------------------
+
+Percpu rw semaphores is a new read-write semaphore design that is
+optimized for locking for reading.
+
+The problem with traditional read-write semaphores is that when multiple
+cores take the lock for reading, the cache line containing the semaphore
+is bouncing between L1 caches of the cores, causing performance
+degradation.
+
+Locking for reading is very fast, it uses RCU and it avoids any atomic
+instruction in the lock and unlock path. On the other hand, locking for
+writing is very expensive, it calls synchronize_rcu() that can take
+hundreds of milliseconds.
+
+The lock is declared with "struct percpu_rw_semaphore" type.
+The lock is initialized percpu_init_rwsem, it returns 0 on success and
+-ENOMEM on allocation failure.
+The lock must be freed with percpu_free_rwsem to avoid memory leak.
+
+The lock is locked for read with percpu_down_read, percpu_up_read and
+for write with percpu_down_write, percpu_up_write.
+
+The idea of using RCU for optimized rw-lock was introduced by
+Eric Dumazet <eric.dumazet@gmail.com>.
+The code was written by Mikulas Patocka <mpatocka@redhat.com>
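
Usage illustration (not part of this commit): the sysfs-block hunk says that
write_same_max_bytes reports how many bytes a single write same command can
cover, and that 0 means the device does not support write same. The C sketch
below shows one way a userspace tool might read and interpret that attribute.
The helper names (parse_write_same_max_bytes, write_same_supported,
read_write_same_max_bytes) are hypothetical, and the sketch assumes the
attribute reads back as a single unsigned decimal number followed by a newline.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the text of a write_same_max_bytes attribute.
 * Assumption: the text is one unsigned decimal number, optionally followed
 * by a newline. Returns 0 and stores the value in *out on success, -1 if
 * the text does not match that shape.
 */
static int parse_write_same_max_bytes(const char *text, unsigned long long *out)
{
    char *end;
    unsigned long long value;

    assert(out != NULL);                 /* caller must supply storage */
    if (text == NULL || *text < '0' || *text > '9')
        return -1;                       /* empty, signed or non-numeric */
    errno = 0;
    value = strtoull(text, &end, 10);
    if (errno != 0)
        return -1;                       /* out of range */
    if (*end != '\0' && *end != '\n')
        return -1;                       /* trailing garbage */
    *out = value;
    return 0;
}

/* Per the sysfs-block text: 0 means write same is not supported. */
static int write_same_supported(unsigned long long max_bytes)
{
    return max_bytes != 0;
}

/*
 * Hypothetical helper: read /sys/block/<disk>/queue/write_same_max_bytes
 * for a disk name such as "sda". Returns 0 on success, -1 on any I/O or
 * parse error (including kernels that do not expose the attribute).
 */
static int read_write_same_max_bytes(const char *disk, unsigned long long *out)
{
    char path[256];
    char buf[64];
    FILE *f;
    int n;

    n = snprintf(path, sizeof(path),
                 "/sys/block/%s/queue/write_same_max_bytes", disk);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;                       /* disk name too long */
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_write_same_max_bytes(buf, out);
}
```

A caller would pass a disk name to read_write_same_max_bytes() and then check
write_same_supported() on the result before attempting to use write same.
Parsing is kept separate from file access so the interpretation of the value
can be exercised without a real block device.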