author | Mauro Carvalho Chehab <mchehab+huawei@kernel.org> | 2020-06-23 15:31:36 +0200 |
---|---|---|
committer | Jonathan Corbet <corbet@lwn.net> | 2020-06-26 11:33:35 -0600 |
commit | 800c02f5d03019716a5926b73144be3bf0276923 (patch) | |
tree | 6af8f1e36cc091b022e6b452537dc62cfddae959 /Documentation/nommu-mmap.txt | |
parent | f00c313b50def14e5d4b8423ef67344806b5bd42 (diff) | |
docs: move nommu-mmap.txt to admin-guide and rename to ReST
The nommu-mmap.txt file provides a description of user-visible
behaviour. So, move it to the admin-guide.
As it is already in ReST format, also rename it.
Suggested-by: Mike Rapoport <rppt@linux.ibm.com>
Suggested-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/3a63d1833b513700755c85bf3bda0a6c4ab56986.1592918949.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Diffstat (limited to 'Documentation/nommu-mmap.txt')
-rw-r--r-- | Documentation/nommu-mmap.txt | 283 |
1 files changed, 0 insertions, 283 deletions
diff --git a/Documentation/nommu-mmap.txt b/Documentation/nommu-mmap.txt
deleted file mode 100644
index 530fed08de2c..000000000000
--- a/Documentation/nommu-mmap.txt
+++ /dev/null
@@ -1,283 +0,0 @@
-=============================
-No-MMU memory mapping support
-=============================
-
-The kernel has limited support for memory mapping under no-MMU conditions, such
-as are used in uClinux environments. From the userspace point of view, memory
-mapping is made use of in conjunction with the mmap() system call, the shmat()
-call and the execve() system call. From the kernel's point of view, execve()
-mapping is actually performed by the binfmt drivers, which call back into the
-mmap() routines to do the actual work.
-
-Memory mapping behaviour also involves the way fork(), vfork(), clone() and
-ptrace() work. Under uClinux there is no fork(), and clone() must be supplied
-the CLONE_VM flag.
-
-The behaviour is similar between the MMU and no-MMU cases, but not identical;
-and it's also much more restricted in the latter case:
-
- (#) Anonymous mapping, MAP_PRIVATE
-
-	In the MMU case: VM regions backed by arbitrary pages; copy-on-write
-	across fork.
-
-	In the no-MMU case: VM regions backed by arbitrary contiguous runs of
-	pages.
-
- (#) Anonymous mapping, MAP_SHARED
-
-	These behave very much like private mappings, except that they're
-	shared across fork() or clone() without CLONE_VM in the MMU case. Since
-	the no-MMU case doesn't support these, behaviour is identical to
-	MAP_PRIVATE there.
-
- (#) File, MAP_PRIVATE, PROT_READ / PROT_EXEC, !PROT_WRITE
-
-	In the MMU case: VM regions backed by pages read from file; changes to
-	the underlying file are reflected in the mapping; copied across fork.
-
-	In the no-MMU case:
-
-         - If one exists, the kernel will re-use an existing mapping to the
-           same segment of the same file if that has compatible permissions,
-           even if this was created by another process.
-
-         - If possible, the file mapping will be directly on the backing device
-           if the backing device has the NOMMU_MAP_DIRECT capability and
-           appropriate mapping protection capabilities. Ramfs, romfs, cramfs
-           and mtd might all permit this.
-
-	 - If the backing device can't or won't permit direct sharing,
-           but does have the NOMMU_MAP_COPY capability, then a copy of the
-           appropriate bit of the file will be read into a contiguous bit of
-           memory and any extraneous space beyond the EOF will be cleared
-
-         - Writes to the file do not affect the mapping; writes to the mapping
-           are visible in other processes (no MMU protection), but should not
-           happen.
-
- (#) File, MAP_PRIVATE, PROT_READ / PROT_EXEC, PROT_WRITE
-
-	In the MMU case: like the non-PROT_WRITE case, except that the pages in
-	question get copied before the write actually happens. From that point
-	on writes to the file underneath that page no longer get reflected into
-	the mapping's backing pages. The page is then backed by swap instead.
-
-	In the no-MMU case: works much like the non-PROT_WRITE case, except
-	that a copy is always taken and never shared.
-
- (#) Regular file / blockdev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE
-
-	In the MMU case: VM regions backed by pages read from file; changes to
-	pages written back to file; writes to file reflected into pages backing
-	mapping; shared across fork.
-
-	In the no-MMU case: not supported.
-
- (#) Memory backed regular file, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE
-
-	In the MMU case: As for ordinary regular files.
-
-	In the no-MMU case: The filesystem providing the memory-backed file
-	(such as ramfs or tmpfs) may choose to honour an open, truncate, mmap
-	sequence by providing a contiguous sequence of pages to map. In that
-	case, a shared-writable memory mapping will be possible. It will work
-	as for the MMU case. If the filesystem does not provide any such
-	support, then the mapping request will be denied.
-
- (#) Memory backed blockdev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE
-
-	In the MMU case: As for ordinary regular files.
-
-	In the no-MMU case: As for memory backed regular files, but the
-	blockdev must be able to provide a contiguous run of pages without
-	truncate being called. The ramdisk driver could do this if it allocated
-	all its memory as a contiguous array upfront.
-
- (#) Memory backed chardev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE
-
-	In the MMU case: As for ordinary regular files.
-
-	In the no-MMU case: The character device driver may choose to honour
-	the mmap() by providing direct access to the underlying device if it
-	provides memory or quasi-memory that can be accessed directly. Examples
-	of such are frame buffers and flash devices. If the driver does not
-	provide any such support, then the mapping request will be denied.
-
-
-Further notes on no-MMU MMAP
-============================
-
- (#) A request for a private mapping of a file may return a buffer that is not
-     page-aligned. This is because XIP may take place, and the data may not be
-     paged aligned in the backing store.
-
- (#) A request for an anonymous mapping will always be page aligned. If
-     possible the size of the request should be a power of two otherwise some
-     of the space may be wasted as the kernel must allocate a power-of-2
-     granule but will only discard the excess if appropriately configured as
-     this has an effect on fragmentation.
-
- (#) The memory allocated by a request for an anonymous mapping will normally
-     be cleared by the kernel before being returned in accordance with the
-     Linux man pages (ver 2.22 or later).
-
-     In the MMU case this can be achieved with reasonable performance as
-     regions are backed by virtual pages, with the contents only being mapped
-     to cleared physical pages when a write happens on that specific page
-     (prior to which, the pages are effectively mapped to the global zero page
-     from which reads can take place). This spreads out the time it takes to
-     initialize the contents of a page - depending on the write-usage of the
-     mapping.
-
-     In the no-MMU case, however, anonymous mappings are backed by physical
-     pages, and the entire map is cleared at allocation time. This can cause
-     significant delays during a userspace malloc() as the C library does an
-     anonymous mapping and the kernel then does a memset for the entire map.
-
-     However, for memory that isn't required to be precleared - such as that
-     returned by malloc() - mmap() can take a MAP_UNINITIALIZED flag to
-     indicate to the kernel that it shouldn't bother clearing the memory before
-     returning it. Note that CONFIG_MMAP_ALLOW_UNINITIALIZED must be enabled
-     to permit this, otherwise the flag will be ignored.
-
-     uClibc uses this to speed up malloc(), and the ELF-FDPIC binfmt uses this
-     to allocate the brk and stack region.
-
- (#) A list of all the private copy and anonymous mappings on the system is
-     visible through /proc/maps in no-MMU mode.
-
- (#) A list of all the mappings in use by a process is visible through
-     /proc/<pid>/maps in no-MMU mode.
-
- (#) Supplying MAP_FIXED or a requesting a particular mapping address will
-     result in an error.
-
- (#) Files mapped privately usually have to have a read method provided by the
-     driver or filesystem so that the contents can be read into the memory
-     allocated if mmap() chooses not to map the backing device directly. An
-     error will result if they don't. This is most likely to be encountered
-     with character device files, pipes, fifos and sockets.
-
-
-Interprocess shared memory
-==========================
-
-Both SYSV IPC SHM shared memory and POSIX shared memory is supported in NOMMU
-mode. The former through the usual mechanism, the latter through files created
-on ramfs or tmpfs mounts.
-
-
-Futexes
-=======
-
-Futexes are supported in NOMMU mode if the arch supports them. An error will
-be given if an address passed to the futex system call lies outside the
-mappings made by a process or if the mapping in which the address lies does not
-support futexes (such as an I/O chardev mapping).
-
-
-No-MMU mremap
-=============
-
-The mremap() function is partially supported. It may change the size of a
-mapping, and may move it [#]_ if MREMAP_MAYMOVE is specified and if the new size
-of the mapping exceeds the size of the slab object currently occupied by the
-memory to which the mapping refers, or if a smaller slab object could be used.
-
-MREMAP_FIXED is not supported, though it is ignored if there's no change of
-address and the object does not need to be moved.
-
-Shared mappings may not be moved. Shareable mappings may not be moved either,
-even if they are not currently shared.
-
-The mremap() function must be given an exact match for base address and size of
-a previously mapped object. It may not be used to create holes in existing
-mappings, move parts of existing mappings or resize parts of mappings. It must
-act on a complete mapping.
-
-.. [#] Not currently supported.
-
-
-Providing shareable character device support
-============================================
-
-To provide shareable character device support, a driver must provide a
-file->f_op->get_unmapped_area() operation. The mmap() routines will call this
-to get a proposed address for the mapping. This may return an error if it
-doesn't wish to honour the mapping because it's too long, at a weird offset,
-under some unsupported combination of flags or whatever.
-
-The driver should also provide backing device information with capabilities set
-to indicate the permitted types of mapping on such devices. The default is
-assumed to be readable and writable, not executable, and only shareable
-directly (can't be copied).
-
-The file->f_op->mmap() operation will be called to actually inaugurate the
-mapping. It can be rejected at that point. Returning the ENOSYS error will
-cause the mapping to be copied instead if NOMMU_MAP_COPY is specified.
-
-The vm_ops->close() routine will be invoked when the last mapping on a chardev
-is removed. An existing mapping will be shared, partially or not, if possible
-without notifying the driver.
-
-It is permitted also for the file->f_op->get_unmapped_area() operation to
-return -ENOSYS. This will be taken to mean that this operation just doesn't
-want to handle it, despite the fact it's got an operation. For instance, it
-might try directing the call to a secondary driver which turns out not to
-implement it. Such is the case for the framebuffer driver which attempts to
-direct the call to the device-specific driver. Under such circumstances, the
-mapping request will be rejected if NOMMU_MAP_COPY is not specified, and a
-copy mapped otherwise.
-
-.. important::
-
-    Some types of device may present a different appearance to anyone
-    looking at them in certain modes. Flash chips can be like this; for
-    instance if they're in programming or erase mode, you might see the
-    status reflected in the mapping, instead of the data.
-
-    In such a case, care must be taken lest userspace see a shared or a
-    private mapping showing such information when the driver is busy
-    controlling the device. Remember especially: private executable
-    mappings may still be mapped directly off the device under some
-    circumstances!
-
-
-Providing shareable memory-backed file support
-==============================================
-
-Provision of shared mappings on memory backed files is similar to the provision
-of support for shared mapped character devices. The main difference is that the
-filesystem providing the service will probably allocate a contiguous collection
-of pages and permit mappings to be made on that.
-
-It is recommended that a truncate operation applied to such a file that
-increases the file size, if that file is empty, be taken as a request to gather
-enough pages to honour a mapping. This is required to support POSIX shared
-memory.
-
-Memory backed devices are indicated by the mapping's backing device info having
-the memory_backed flag set.
-
-
-Providing shareable block device support
-========================================
-
-Provision of shared mappings on block device files is exactly the same as for
-character devices. If there isn't a real device underneath, then the driver
-should allocate sufficient contiguous memory to honour any supported mapping.
-
-
-Adjusting page trimming behaviour
-=================================
-
-NOMMU mmap automatically rounds up to the nearest power-of-2 number of pages
-when performing an allocation. This can have adverse effects on memory
-fragmentation, and as such, is left configurable. The default behaviour is to
-aggressively trim allocations and discard any excess pages back in to the page
-allocator. In order to retain finer-grained control over fragmentation, this
-behaviour can either be disabled completely, or bumped up to a higher page
-watermark where trimming begins.
-
-Page trimming behaviour is configurable via the sysctl ``vm.nr_trim_pages``.
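
As a hedged illustration of the sysctl named in the last section of the removed file, the sketch below reads `vm.nr_trim_pages` from userspace. It is not part of the patch. The helper name `read_nr_trim_pages` is hypothetical, and the procfs path used in the usage note assumes the usual /proc/sys layout for sysctls. On a kernel built with an MMU the entry is expected to be absent, in which case the helper returns -1.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): read a single integer sysctl
 * value, such as vm.nr_trim_pages, from its procfs file.
 *
 * Returns the parsed value, or -1 if the file cannot be opened or does
 * not start with an integer (for example on an MMU kernel, where the
 * sysctl is assumed not to exist).
 */
long read_nr_trim_pages(const char *path)
{
    FILE *f;
    long value = -1;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* The file is expected to hold one decimal number. */
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;

    fclose(f);
    return value;
}
```

A caller would typically pass `"/proc/sys/vm/nr_trim_pages"` and treat -1 as "not a no-MMU kernel, or sysctl unavailable". Returning a sentinel rather than aborting keeps the helper usable on both kinds of kernel.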