summaryrefslogtreecommitdiff
path: root/drivers/mtd/mtd_blkdevs.c
AgeCommit message (Collapse)Author
2023-06-12block: replace fmode_t with a block-specific type for block open flagsChristoph Hellwig
The only overlap between the block open flags mapped into the fmode_t and other uses of fmode_t are FMODE_READ and FMODE_WRITE. Define a new blk_mode_t instead for use in blkdev_get_by_{dev,path}, ->open and ->ioctl and stop abusing fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-28-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12mtd: block: use a simple bool to track open for writeChristoph Hellwig
Instead of propagating the fmode_t, just use a bool to track if a mtd block device was opened for writing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20230608110258.189493-23-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12block: remove the unused mode argument to ->releaseChristoph Hellwig
The mode argument to the ->release block_device_operation is never used, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Link: https://lore.kernel.org/r/20230608110258.189493-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12block: pass a gendisk to ->openChristoph Hellwig
->open is only called on the whole device. Make that explicit by passing a gendisk instead of the block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Link: https://lore.kernel.org/r/20230608110258.189493-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-28block: remove blk_cleanup_diskChristoph Hellwig
blk_cleanup_disk is nothing but a trivial wrapper for put_disk now, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17block: remove QUEUE_FLAG_DISCARDChristoph Hellwig
Just use a non-zero max_discard_sectors as an indicator for discard support, similar to what is done for write zeroes. The only places where needs special attention is the RAID5 driver, which must clear discard support for security reasons by default, even if the default stacking rules would allow for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-26mtd_blkdevs: avoid soft lockups with some mtd/spi devicesDavid Decotigny
With some spi devices, the heavy cpu usage due to polling the spi registers may lead to netdev timeouts, RCU complaints, etc. This can be acute in the absence of CONFIG_PREEMPT. This patch allows to give enough breathing room to avoid those incorrectly detected netdev timeouts for example. Example splat on 5.10.92: [ 828.399306] rcu: INFO: rcu_sched self-detected stall on CPU ... [ 828.419245] Task dump for CPU 1: [ 828.422465] task:kworker/1:1H state:R running task on cpu 1 stack: 0 pid: 76 ppid: 2 flags:0x0000002a [ 828.433132] Workqueue: kblockd blk_mq_run_work_fn [ 828.437820] Call trace: ... [ 828.512267] spi_mem_exec_op+0x4d0/0xde0 [ 828.516184] spi_mem_dirmap_read+0x180/0x39c [ 828.520443] spi_nor_read_data+0x428/0x7e8 [ 828.524523] spi_nor_read+0x154/0x214 [ 828.528172] mtd_read_oob+0x440/0x714 [ 828.531815] mtd_read+0xac/0x120 [ 828.535030] mtdblock_readsect+0x178/0x230 [ 828.539102] mtd_blktrans_work+0x9fc/0xf28 [ 828.543177] mtd_queue_rq+0x1ac/0x2e4 [ 828.546827] blk_mq_dispatch_rq_list+0x2cc/0xa44 [ 828.551419] blk_mq_do_dispatch_sched+0xb0/0x7cc [ 828.556010] __blk_mq_sched_dispatch_requests+0x350/0x494 [ 828.561372] blk_mq_sched_dispatch_requests+0xac/0xe4 [ 828.566387] __blk_mq_run_hw_queue+0x130/0x254 [ 828.570806] blk_mq_run_work_fn+0x50/0x60 [ 828.574814] process_one_work+0x578/0xf1c [ 828.578814] worker_thread+0x5dc/0xea0 [ 828.582547] kthread+0x270/0x2d4 [ 828.585765] ret_from_fork+0x10/0x30 Signed-off-by: David Decotigny <ddecotig@google.com> Reviewed-by: Richard Weinberger <richard@nod.at> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220126101120.676021-1-decot+git@google.com
2021-12-12mtd_blkdevs: don't scan partitions for plain mtdblockChristoph Hellwig
mtdblock / mtdblock_ro set part_bits to 0 and thus nevever scanned partitions. Restore that behavior by setting the GENHD_FL_NO_PART flag. Fixes: 1ebe2e5f9d68e94c ("block: remove GENHD_FL_EXT_DEVT") Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20211206070409.2836165-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-29mtd_blkdevs: remove the sector out of range check in do_blktrans_requestChristoph Hellwig
The block layer already performs this check, no need to duplicate it in the driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20211126121802.2090656-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-29block: remove rq_flush_dcache_pagesChristoph Hellwig
This function is trivial, and flush_dcache_page is always defined, so just open code it in the 2.5 callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20211117061404.331732-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-21mtd: add add_disk() error handlingLuis Chamberlain
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Acked-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211015233028.2167651-10-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-05Merge tag 'mtd/for-5.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD updates from Miquel Raynal: "MTD changes: - blkdevs: - Simplify the refcounting in blktrans_{open, release} - Simplify blktrans_getgeo - Remove blktrans_ref_mutex - Simplify blktrans_dev_get - Use lockdep_assert_held - Don't hold del_mtd_blktrans_dev in blktrans_{open, release} - ftl: - Don't cast away the type when calling add_mtd_blktrans_dev - Don't cast away the type when calling add_mtd_blktrans_dev - Use container_of() rather than cast - Fix use-after-free - Add discard support - Allow use of MTD_RAM for testing purposes - concat: - Check _read, _write callbacks existence before assignment - Judge callback existence based on the master - maps: - Maps: remove dead MTD map driver for PMC-Sierra MSP boards - mtdblock: - Warn if added for a NAND device - Add comment about UBI block devices - Update old JFFS2 mention in Kconfig - partitions: - Redboot: convert to YAML NAND core changes: - Repair Miquel Raynal's email address in MAINTAINERS - Fix a couple of spelling mistakes in Kconfig - bbt: Skip bad blocks when searching for the BBT in NAND - Remove never changed ret variable Raw NAND changes: - cafe: Fix a resource leak in the error handling path of 'cafe_nand_probe()' - intel: Fix error handling in probe - omap: Fix kernel doc warning on 'calcuate' typo - gpmc: Fix the ECC bytes vs. OOB bytes equation SPI-NAND core changes: - Properly fill the OOB area. - Fix comment SPI-NAND drivers changes: - macronix: Add Quad support for serial NAND flash" * tag 'mtd/for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (30 commits) mtd: rawnand: cafe: Fix a resource leak in the error handling path of 'cafe_nand_probe()' mtd_blkdevs: simplify the refcounting in blktrans_{open, release} mtd_blkdevs: simplify blktrans_getgeo mtd_blkdevs: remove blktrans_ref_mutex mtd_blkdevs: simplify blktrans_dev_get mtd/rfd_ftl: don't cast away the type when calling add_mtd_blktrans_dev mtd/ftl: don't cast away the type when calling add_mtd_blktrans_dev mtd_blkdevs: use lockdep_assert_held mtd_blkdevs: don't hold del_mtd_blktrans_dev in blktrans_{open, release} mtd: rawnand: intel: Fix error handling in probe mtd: mtdconcat: Check _read, _write callbacks existence before assignment mtd: mtdconcat: Judge callback existence based on the master mtd: maps: remove dead MTD map driver for PMC-Sierra MSP boards mtd: rfd_ftl: use container_of() rather than cast mtd: rfd_ftl: fix use-after-free mtd: rfd_ftl: add discard support mtd: rfd_ftl: allow use of MTD_RAM for testing purposes mtdblock: Warn if added for a NAND device mtd: spinand: macronix: Add Quad support for serial NAND flash mtdblock: Add comment about UBI block devices ...
2021-08-23mtd_blkdevs: simplify the refcounting in blktrans_{open, release}Christoph Hellwig
Always grab a reference to the mtd_blktrans_dev in ->open instead of just on the first open, and do away with the additional temporary references in ->open and ->release. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210823073359.705281-9-hch@lst.de
2021-08-23mtd_blkdevs: simplify blktrans_getgeoChristoph Hellwig
No need to grab a mtd_blktrans_dev given that ->open already holds one and ->getgeo can only be called on an open disk. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210823073359.705281-8-hch@lst.de
2021-08-23mtd_blkdevs: remove blktrans_ref_mutexChristoph Hellwig
blktrans_ref_mutex is not actually needed. The kref is serialized internally, and devnum assignment in add_mtd_blktrans_dev happens before the disk is added and thus any of the block_device_operations methods otherwise using it are called. It is also already serialized by the global mtd_table_mutex. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210823073359.705281-7-hch@lst.de
2021-08-23mtd_blkdevs: simplify blktrans_dev_getChristoph Hellwig
->private_data is set before the disk is added and never cleared, so don't bother trying to handle a NULL pointer there. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210823073359.705281-6-hch@lst.de
2021-08-23mtd_blkdevs: use lockdep_assert_heldChristoph Hellwig
Use lockdep_assert_held to ensure mtd_table_mutex is held instead of mutex_trylock games. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210823073359.705281-3-hch@lst.de
2021-08-23mtd_blkdevs: don't hold del_mtd_blktrans_dev in blktrans_{open, release}Christoph Hellwig
There is nothing that this protects against except for slightly reducing the window when new opens can appear just before calling del_gendisk. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210823073359.705281-2-hch@lst.de
2021-08-06mtd: fix lock hierarchy in deregister_mtd_blktransDesmond Cheong Zhi Xi
There is a lock hierarchy of major_names_lock --> mtd_table_mutex. One existing chain is as follows: 1. major_names_lock --> loop_ctl_mutex (when blk_request_module calls loop_probe) 2. loop_ctl_mutex --> bdev->bd_mutex (when loop_control_ioctl calls loop_remove, which then calls del_gendisk) 3. bdev->bd_mutex --> mtd_table_mutex (when blkdev_get_by_dev calls __blkdev_get, which then calls blktrans_open) Since unregister_blkdev grabs the major_names_lock, we need to call it outside the critical section for mtd_table_mutex, otherwise we invert the lock hierarchy. Reported-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210717100719.728829-1-desmondcheongzx@gmail.com
2021-07-16mtd: break circular locks in register_mtd_blktransDesmond Cheong Zhi Xi
Syzbot reported a circular locking dependency: https://syzkaller.appspot.com/bug?id=7bd106c28e846d1023d4ca915718b1a0905444cb This happens because of the following lock dependencies: 1. loop_ctl_mutex -> bdev->bd_mutex (when loop_control_ioctl calls loop_remove, which then calls del_gendisk; this also happens in loop_exit which eventually calls loop_remove) 2. bdev->bd_mutex -> mtd_table_mutex (when blkdev_get_by_dev calls __blkdev_get, which then calls blktrans_open) 3. mtd_table_mutex -> major_names_lock (when register_mtd_blktrans calls __register_blkdev) 4. major_names_lock -> loop_ctl_mutex (when blk_request_module calls loop_probe) Hence there's an overall dependency of: loop_ctl_mutex ----------> bdev->bd_mutex ^ | | | | v major_names_lock <--------- mtd_table_mutex We can break this circular dependency by holding mtd_table_mutex only for the required critical section in register_mtd_blktrans. This avoids the mtd_table_mutex -> major_names_lock dependency. Reported-and-tested-by: syzbot+6a8a0d93c91e8fbf2e80@syzkaller.appspotmail.com Co-developed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210617160904.570111-1-desmondcheongzx@gmail.com
2021-07-16mtd: mtd_blkdevs: Initialize rq.limits.discard_granularityZhihao Cheng
Since commit b35fd7422c2f8("block: check queue's limits.discard_granularity in __blkdev_issue_discard()") checks rq.limits.discard_granularity in __blkdev_issue_discard(), we may get following warnings on formatted ftl: WARNING: CPU: 2 PID: 7313 at block/blk-lib.c:51 __blkdev_issue_discard+0x2a7/0x390 Reproducer: 1. ftl_format /dev/mtd0 2. modprobe ftl 3. mkfs.vfat /dev/ftla 4. mount -odiscard /dev/ftla temp 5. dd if=/dev/zero of=temp/tst bs=1M count=10 oflag=direct 6. dd if=/dev/zero of=temp/tst bs=1M count=10 oflag=direct Fix it by initializing rq.limits.discard_granularity if device supports discard operation. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210615093905.3473709-1-chengzhihao1@huawei.com
2021-06-16mtd_blkdevs: initialze new->rq in add_mtd_blktrans_devChristoph Hellwig
Various places expect the request_queue in ->rq. Initialize it to avoid NULL pointer derefences. Fixes: 6966bb921def ("mtd_blkdevs: use blk_mq_alloc_disk") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-11mtd_blkdevs: use blk_mq_alloc_diskChristoph Hellwig
Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210602065345.355274-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-11-16mtd_blkdevs: don't override BLKFLSBUFChristoph Hellwig
BLKFLSBUF is not supposed to actually send a flush command to the device, but to tear down buffer cache structures. Remove the mtd_blkdevs implementation and just use the default semantics instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 102Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation inc 51 franklin st fifth floor boston ma 02110 1301 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 50 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190523091649.499889647@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-16mtd_blkdevs: convert to blk-mqJens Axboe
Straight forward conversion, using an internal list to enable the driver to pull requests at will. Dynamically allocate the tag set to avoid having to pull in the block headers for blktrans.h, since various mtd drivers use block conflicting names for defines and functions. Cc: David Woodhouse <dwmw2@infradead.org> Cc: linux-mtd@lists.infradead.org Tested-by: Richard Weinberger <richard@nod.at> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-28block: genhd: add 'groups' argument to device_add_diskHannes Reinecke
Update device_add_disk() to take an 'groups' argument so that individual drivers can register a device with additional sysfs attributes. This avoids race condition the driver would otherwise have if these groups were to be created with sysfs_add_groups(). Signed-off-by: Martin Wilck <martin.wilck@suse.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-11mtd_blkdevs: handle highmem pagesChristoph Hellwig
Just kmap the single payload page before passing it on to the FTL. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-08block: Use blk_queue_flag_*() in drivers instead of queue_flag_*()Bart Van Assche
This patch has been generated as follows: for verb in set_unlocked clear_unlocked set clear; do replace-in-files queue_flag_${verb} blk_queue_flag_${verb%_unlocked} \ $(git grep -lw queue_flag_${verb} drivers block/bsg*) done Except for protecting all queue flag changes with the queue lock this patch does not change any functionality. Cc: Mike Snitzer <snitzer@redhat.com> Cc: Shaohua Li <shli@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Ming Lei <ming.lei@redhat.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-12mtd: blkdevs: Fix mtd block write failureAbhishek Sahu
All the MTD block write requests are failing with following error messages mkfs.ext4 /dev/mtdblock0 print_req_error: I/O error, dev mtdblock0, sector 0 Buffer I/O error on dev mtdblock0, logical block 0, lost async page write The control is going to default case after block write request because of missing return. Fixes: commit 2a842acab109 ("block: introduce new block status code type") Signed-off-by: Abhishek Sahu <absahu@codeaurora.org> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2017-06-27block: don't set bounce limit in blk_init_queueChristoph Hellwig
Instead move it to the callers. Those that either don't use bio_data() or page_address() or are specific to architectures that do not support highmem are skipped. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-09block: introduce new block status code typeChristoph Hellwig
Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31block: fold cmd_type into the REQ_OP_ spaceChristoph Hellwig
Instead of keeping two levels of indirection for requests types, fold it all into the operations. The little caveat here is that previously cmd_type only applied to struct request, while the request and bio op fields were set to plain REQ_OP_READ/WRITE even for passthrough operations. Instead this patch adds new REQ_OP_* for SCSI passthrough and driver private requests, althought it has to add two for each so that we can communicate the data in/out nature of the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-12-24Replace <asm/uaccess.h> with <linux/uaccess.h> globallyLinus Torvalds
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-27block: convert to device_add_disk()Dan Williams
For block drivers that specify a parent device, convert them to use device_add_disk(). This conversion was done with the following semantic patch: @@ struct gendisk *disk; expression E; @@ - disk->driverfs_dev = E; ... - add_disk(disk); + device_add_disk(E, disk); @@ struct gendisk *disk; expression E1, E2; @@ - disk->driverfs_dev = E1; ... E2 = disk; ... - add_disk(E2); + device_add_disk(E1, E2); ...plus some manual fixups for a few missed conversions. Cc: Jens Axboe <axboe@fb.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-06-07block, drivers: add REQ_OP_FLUSH operationMike Christie
This adds a REQ_OP_FLUSH operation that is sent to request_fn based drivers by the block layer's flush code, instead of sending requests with the request->cmd_flags REQ_FLUSH bit set. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-07drivers: use req op accessorMike Christie
The req operation REQ_OP is separated from the rq_flag_bits definition. This converts the block layer drivers to use req_op to get the op from the request struct. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-04-12mtd: switch to using blk_queue_write_cache()Jens Axboe
Signed-off-by: Jens Axboe <axboe@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2015-10-30mtd: blkdevs: fix potential deadlock + lockdep warningsBrian Norris
Commit 073db4a51ee4 ("mtd: fix: avoid race condition when accessing mtd->usecount") fixed a race condition but due to poor ordering of the mutex acquisition, introduced a potential deadlock. The deadlock can occur, for example, when rmmod'ing the m25p80 module, which will delete one or more MTDs, along with any corresponding mtdblock devices. This could potentially race with an acquisition of the block device as follows. -> blktrans_open() -> mutex_lock(&dev->lock); -> mutex_lock(&mtd_table_mutex); -> del_mtd_device() -> mutex_lock(&mtd_table_mutex); -> blktrans_notify_remove() -> del_mtd_blktrans_dev() -> mutex_lock(&dev->lock); This is a classic (potential) ABBA deadlock, which can be fixed by making the A->B ordering consistent everywhere. There was no real purpose to the ordering in the original patch, AFAIR, so this shouldn't be a problem. This ordering was actually already present in del_mtd_blktrans_dev(), for one, where the function tried to ensure that its caller already held mtd_table_mutex before it acquired &dev->lock: if (mutex_trylock(&mtd_table_mutex)) { mutex_unlock(&mtd_table_mutex); BUG(); } So, reverse the ordering of acquisition of &dev->lock and &mtd_table_mutex so we always acquire mtd_table_mutex first. Snippets of the lockdep output follow: # modprobe -r m25p80 [ 53.419251] [ 53.420838] ====================================================== [ 53.427300] [ INFO: possible circular locking dependency detected ] [ 53.433865] 4.3.0-rc6 #96 Not tainted [ 53.437686] ------------------------------------------------------- [ 53.444220] modprobe/372 is trying to acquire lock: [ 53.449320] (&new->lock){+.+...}, at: [<c043fe4c>] del_mtd_blktrans_dev+0x80/0xdc [ 53.457271] [ 53.457271] but task is already holding lock: [ 53.463372] (mtd_table_mutex){+.+.+.}, at: [<c0439994>] del_mtd_device+0x18/0x100 [ 53.471321] [ 53.471321] which lock already depends on the new lock. [ 53.471321] [ 53.479856] [ 53.479856] the existing dependency chain (in reverse order) is: [ 53.487660] -> #1 (mtd_table_mutex){+.+.+.}: [ 53.492331] [<c043fc5c>] blktrans_open+0x34/0x1a4 [ 53.497879] [<c01afce0>] __blkdev_get+0xc4/0x3b0 [ 53.503364] [<c01b0bb8>] blkdev_get+0x108/0x320 [ 53.508743] [<c01713c0>] do_dentry_open+0x218/0x314 [ 53.514496] [<c0180454>] path_openat+0x4c0/0xf9c [ 53.519959] [<c0182044>] do_filp_open+0x5c/0xc0 [ 53.525336] [<c0172758>] do_sys_open+0xfc/0x1cc [ 53.530716] [<c000f740>] ret_fast_syscall+0x0/0x1c [ 53.536375] -> #0 (&new->lock){+.+...}: [ 53.540587] [<c063f124>] mutex_lock_nested+0x38/0x3cc [ 53.546504] [<c043fe4c>] del_mtd_blktrans_dev+0x80/0xdc [ 53.552606] [<c043f164>] blktrans_notify_remove+0x7c/0x84 [ 53.558891] [<c04399f0>] del_mtd_device+0x74/0x100 [ 53.564544] [<c043c670>] del_mtd_partitions+0x80/0xc8 [ 53.570451] [<c0439aa0>] mtd_device_unregister+0x24/0x48 [ 53.576637] [<c046ce6c>] spi_drv_remove+0x1c/0x34 [ 53.582207] [<c03de0f0>] __device_release_driver+0x88/0x114 [ 53.588663] [<c03de19c>] device_release_driver+0x20/0x2c [ 53.594843] [<c03dd9e8>] bus_remove_device+0xd8/0x108 [ 53.600748] [<c03dacc0>] device_del+0x10c/0x210 [ 53.606127] [<c03dadd0>] device_unregister+0xc/0x20 [ 53.611849] [<c046d878>] __unregister+0x10/0x20 [ 53.617211] [<c03da868>] device_for_each_child+0x50/0x7c [ 53.623387] [<c046eae8>] spi_unregister_master+0x58/0x8c [ 53.629578] [<c03e12f0>] release_nodes+0x15c/0x1c8 [ 53.635223] [<c03de0f8>] __device_release_driver+0x90/0x114 [ 53.641689] [<c03de900>] driver_detach+0xb4/0xb8 [ 53.647147] [<c03ddc78>] bus_remove_driver+0x4c/0xa0 [ 53.652970] [<c00cab50>] SyS_delete_module+0x11c/0x1e4 [ 53.658976] [<c000f740>] ret_fast_syscall+0x0/0x1c [ 53.664621] [ 53.664621] other info that might help us debug this: [ 53.664621] [ 53.672979] Possible unsafe locking scenario: [ 53.672979] [ 53.679169] CPU0 CPU1 [ 53.683900] ---- ---- [ 53.688633] lock(mtd_table_mutex); [ 53.692383] lock(&new->lock); [ 53.698306] lock(mtd_table_mutex); [ 53.704658] lock(&new->lock); [ 53.707946] [ 53.707946] *** DEADLOCK *** Fixes: 073db4a51ee4 ("mtd: fix: avoid race condition when accessing mtd->usecount") Reported-by: Felipe Balbi <balbi@ti.com> Tested-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Cc: <stable@vger.kernel.org>
2015-09-28mtd: blktrans: fix multiplication overflowPeng Fan
In drivers/mtd/mtd_blkdevs.c: 406 set_capacity(gd, (new->size * tr->blksize) >> 9); The type of new->size is unsigned long and the type of tr->blksize is int, the result of 'new->size * tr->blksize' may exceed ULONG_MAX on 32bit machines. I use nand chip MT29F32G08CBADBWP which is 4GB and the parameters passed to kernel is 'mtdparts=gpmi-nand:-(user)', the whole nand chip will be treated as a 4GB mtd partition. new->size is 0x800000 and tr->blksize is 0x200, 'new->size * tr->blksize' however is 0. This is what we do not want to see. Using type cast u64 to fix the multiplication overflow issue. Signed-off-by: Peng Fan <van.freenix@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2015-09-02Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block updates from Jens Axboe: "This first core part of the block IO changes contains: - Cleanup of the bio IO error signaling from Christoph. We used to rely on the uptodate bit and passing around of an error, now we store the error in the bio itself. - Improvement of the above from myself, by shrinking the bio size down again to fit in two cachelines on x86-64. - Revert of the max_hw_sectors cap removal from a revision again, from Jeff Moyer. This caused performance regressions in various tests. Reinstate the limit, bump it to a more reasonable size instead. - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me. Most devices have huge trim limits, which can cause nasty latencies when deleting files. Enable the admin to configure the size down. We will look into having a more sane default instead of UINT_MAX sectors. - Improvement of the SGP gaps logic from Keith Busch. - Enable the block core to handle arbitrarily sized bios, which enables a nice simplification of bio_add_page() (which is an IO hot path). From Kent. - Improvements to the partition io stats accounting, making it faster. From Ming Lei. - Also from Ming Lei, a basic fixup for overflow of the sysfs pending file in blk-mq, as well as a fix for a blk-mq timeout race condition. - Ming Lin has been carrying Kents above mentioned patches forward for a while, and testing them. Ming also did a few fixes around that. - Sasha Levin found and fixed a use-after-free problem introduced by the bio->bi_error changes from Christoph. - Small blk cgroup cleanup from Viresh Kumar" * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits) blk: Fix bio_io_vec index when checking bvec gaps block: Replace SG_GAPS with new queue limits mask block: bump BLK_DEF_MAX_SECTORS to 2560 Revert "block: remove artifical max_hw_sectors cap" blk-mq: fix race between timeout and freeing request blk-mq: fix buffer overflow when reading sysfs file of 'pending' Documentation: update notes in biovecs about arbitrarily sized bios block: remove bio_get_nr_vecs() fs: use helper bio_add_page() instead of open coding on bi_io_vec block: kill merge_bvec_fn() completely md/raid5: get rid of bio_fits_rdev() md/raid5: split bio for chunk_aligned_read block: remove split code in blkdev_issue_{discard,write_same} btrfs: remove bio splitting and merge_bvec_fn() calls bcache: remove driver private bio splitting code block: simplify bio_add_page() block: make generic_make_request handle arbitrarily sized bios blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) block: don't access bio->bi_error after bio_put() block: shrink struct bio down to 2 cache lines again ...
2015-08-25mtd: blkdevs: fix switch-bool compilation warningTomer Barletz
With gcc 5.1 I get: warning: switch condition has boolean value [-Wswitch-bool] Signed-off-by: Tomer Barletz <barletz@gmail.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2015-07-17block: have drivers use blk_queue_max_discard_sectors()Jens Axboe
Some drivers use it now, others just set the limits field manually. But in preparation for splitting this into a hard and soft limit, ensure that they all call the proper function for setting the hw limit for discards. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-21mtd: blktrans: use better error code for unimplemented ioctl()Brian Norris
In commit 50183936254b ("mtd: blktrans: change blktrans_getgeo return value") we fixed the problem that ioctl(HDIO_GETGEO) might return 0 (success) for mtdblock devices which did not implement the feature and would leave a blank (zero) result. But now, let's get the error code right. Other code paths on this ioctl tend to use -ENOTTY to notify the user that the ioctl() is not supported for the device, so let's use that instead of -EOPNOTSUPP. Suggested-by: Richard Weinberger <richard@nod.at> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2015-05-21mtd: blktrans: change blktrans_getgeo return valueWenlin Kang
Modify function blktrans_getgeo()'s return value to -EOPNOTSUPP when dev->tr->getgeo == NULL. We shouldn't make the return value to 0 when dev->tr->getgeo == NULL, because the function blktrans_getgeo() has an output value "hd_geometry" which is usually used by some application, if returns 0 (i.e., "success"), it will make some application get the wrong information. Signed-off-by: Wenlin Kang <wenlin.kang@windriver.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2015-05-12mtd: fix: avoid race condition when accessing mtd->usecountBrian Norris
On A MIPS 32-cores machine a BUG_ON was triggered because some acesses to mtd->usecount were done without taking mtd_table_mutex. kernel: Call Trace: kernel: [<ffffffff80401818>] __put_mtd_device+0x20/0x50 kernel: [<ffffffff804086f4>] blktrans_release+0x8c/0xd8 kernel: [<ffffffff802577e0>] __blkdev_put+0x1a8/0x200 kernel: [<ffffffff802579a4>] blkdev_close+0x1c/0x30 kernel: [<ffffffff8022006c>] __fput+0xac/0x250 kernel: [<ffffffff80171208>] task_work_run+0xd8/0x120 kernel: [<ffffffff8012c23c>] work_notifysig+0x10/0x18 kernel: kernel: Code: 2442ffff ac8202d8 000217fe <00020336> dc820128 10400003 00000000 0040f809 00000000 kernel: ---[ end trace 080fbb4579b47a73 ]--- Fixed by taking the mutex in blktrans_open and blktrans_release. Note that this locking is already suggested in include/linux/mtd/blktrans.h: struct mtd_blktrans_ops { ... /* Called with mtd_table_mutex held; no race with add/remove */ int (*open)(struct mtd_blktrans_dev *dev); void (*release)(struct mtd_blktrans_dev *dev); ... }; But we weren't following it. Originally reported by (and patched by) Zhang and Giuseppe, independently. Improved and rewritten. Cc: stable@vger.kernel.org Reported-by: Zhang Xingcai <zhangxingcai@huawei.com> Reported-by: Giuseppe Cantavenera <giuseppe.cantavenera.ext@nokia.com> Tested-by: Giuseppe Cantavenera <giuseppe.cantavenera.ext@nokia.com> Acked-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2015-03-11mtd: blkdevs: remove dead codeBrian Norris
The only exit (break) from the preceding loop is nested within a condition which yields req == NULL. This code is dead. Coverity CID #752669 Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2014-10-04block: disable entropy contributions for nonrot devicesMike Snitzer
Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set QUEUE_FLAG_NONROT. Historically, all block devices have automatically made entropy contributions. But as previously stated in commit e2e1a148 ("block: add sysfs knob for turning off disk entropy contributions"): - On SSD disks, the completion times aren't as random as they are for rotational drives. So it's questionable whether they should contribute to the random pool in the first place. - Calling add_disk_randomness() has a lot of overhead. There are more reliable sources for randomness than non-rotational block devices. From a security perspective it is better to err on the side of caution than to allow entropy contributions from unreliable "random" sources. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-06-11Merge tag 'for-linus-20140610' of git://git.infradead.org/linux-mtdLinus Torvalds
Pull MTD updates from Brian Norris: - refactor m25p80.c driver for use as a general SPI NOR framework for other drivers which may speak to SPI NOR flash without providing full SPI support (i.e., not part of drivers/spi/) - new Freescale QuadSPI driver (utilizing new SPI NOR framework) - updates for the STMicro "FSM" SPI NOR driver - fix sync/flush behavior on mtd_blkdevs - fixup subpage write support on a few NAND drivers - correct the MTD OOB test for odd-sized OOB areas - add BCH-16 support for OMAP NAND - fix warnings and trivial refactoring - utilize new ECC DT bindings in pxa3xx NAND driver - new LPDDR NVM driver - address a few assorted bugs caught by Coverity - add new imx6sx support for GPMI NAND - use a bounce buffer for NAND when non-DMA-able buffers are used * tag 'for-linus-20140610' of git://git.infradead.org/linux-mtd: (77 commits) mtd: gpmi: add gpmi support for imx6sx mtd: maps: remove check for CONFIG_MTD_SUPERH_RESERVE mtd: bf5xx_nand: use the managed version of kzalloc mtd: pxa3xx_nand: make the driver work on big-endian systems mtd: nand: omap: fix omap_calculate_ecc_bch() for-loop error mtd: nand: r852: correct write_buf loop bounds mtd: nand_bbt: handle error case for nand_create_badblock_pattern() mtd: nand_bbt: remove unused variable mtd: maps: sc520cdp: fix warnings mtd: slram: fix unused variable warning mtd: pfow: remove unused variable mtd: lpddr: fix Kconfig dependency, for I/O accessors mtd: nand: pxa3xx: Add supported ECC strength and step size to the DT binding mtd: nand: pxa3xx: Use ECC strength and step size devicetree binding mtd: nand: pxa3xx: Clean pxa_ecc_init() error handling mtd: nand: Warn the user if the selected ECC strength is too weak mtd: nand: omap: Documentation: How to select correct ECC scheme for your device ? mtd: nand: omap: add support for BCH16_ECC - NAND driver updates mtd: nand: omap: add support for BCH16_ECC - ELM driver updates mtd: nand: omap: add support for BCH16_ECC - GPMC driver updates ...
2014-04-15mtd: mtd_blkdevs: handle REQ_FLUSH request and do explicit flush of ↵Roman Peniaev
writeback buffer mtd_blkdevs is device with volatile cache (writeback buffer), so it should support REQ_FLUSH to do explicit flush. Without this patch 'sync' does not guarantee that writeback buffer will be flushed on disk in case of power off, e.g.: $ cp some_file /mnt $ sync ### POWER OFF In case of this sequence writeback buffer will not be flushed on disk. This patch fixes this behaviour and explicitly reports to block layer that flush requests are being supported. Signed-off-by: Roman Peniaev <r.peniaev@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: linux-mtd@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Brian Norris <computersforpeace@gmail.com>