summaryrefslogtreecommitdiff
path: root/fs/hugetlbfs/inode.c
diff options
context:
space:
mode:
authorMel Gorman <mel@csn.ul.ie>2008-07-23 21:27:23 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2008-07-24 10:47:16 -0700
commita1e78772d72b2616ed20e54896e68e0e7044854e (patch)
treed752dd96c2a4fcc555779a7aa99f95069c9b95ae /fs/hugetlbfs/inode.c
parentfc1b8a73dd71226902a11928dd5500326e101df9 (diff)
downloadlwn-a1e78772d72b2616ed20e54896e68e0e7044854e.tar.gz
lwn-a1e78772d72b2616ed20e54896e68e0e7044854e.zip
hugetlb: reserve huge pages for reliable MAP_PRIVATE hugetlbfs mappings until fork()
This patch reserves huge pages at mmap() time for MAP_PRIVATE mappings in a similar manner to the reservations taken for MAP_SHARED mappings. The reserve count is accounted both globally and on a per-VMA basis for private mappings. This guarantees that a process that successfully calls mmap() will successfully fault all pages in the future unless fork() is called. The characteristics of private mappings of hugetlbfs files behaviour after this patch are; 1. The process calling mmap() is guaranteed to succeed all future faults until it forks(). 2. On fork(), the parent may die due to SIGKILL on writes to the private mapping if enough pages are not available for the COW. For reasonably reliable behaviour in the face of a small huge page pool, children of hugepage-aware processes should not reference the mappings; such as might occur when fork()ing to exec(). 3. On fork(), the child VMAs inherit no reserves. Reads on pages already faulted by the parent will succeed. Successful writes will depend on enough huge pages being free in the pool. 4. Quotas of the hugetlbfs mount are checked at reserve time for the mapper and at fault time otherwise. Before this patch, all reads or writes in the child potentially needs page allocations that can later lead to the death of the parent. This applies to reads and writes of uninstantiated pages as well as COW. After the patch it is only a write to an instantiated page that causes problems. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'fs/hugetlbfs/inode.c')
-rw-r--r--fs/hugetlbfs/inode.c8
1 files changed, 4 insertions, 4 deletions
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index aeabf80f81a5..1576bbecd084 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -103,9 +103,9 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
ret = -ENOMEM;
len = vma_len + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
- if (vma->vm_flags & VM_MAYSHARE &&
- hugetlb_reserve_pages(inode, vma->vm_pgoff >> (HPAGE_SHIFT-PAGE_SHIFT),
- len >> HPAGE_SHIFT))
+ if (hugetlb_reserve_pages(inode,
+ vma->vm_pgoff >> (HPAGE_SHIFT-PAGE_SHIFT),
+ len >> HPAGE_SHIFT, vma))
goto out;
ret = 0;
@@ -942,7 +942,7 @@ struct file *hugetlb_file_setup(const char *name, size_t size)
goto out_dentry;
error = -ENOMEM;
- if (hugetlb_reserve_pages(inode, 0, size >> HPAGE_SHIFT))
+ if (hugetlb_reserve_pages(inode, 0, size >> HPAGE_SHIFT, NULL))
goto out_inode;
d_instantiate(dentry, inode);