mm anon rmap: remove anon_vma_moveto_tail

mremap() had a clever optimization where move_ptes() did not take the anon_vma lock to avoid a race with anon rmap users such as page migration. Instead, the avc's were ordered in such a way that the origin vma was always visited by rmap before the destination. This ordering and the use of page table locks rmap usage safe. However, we want to replace the use of linked lists in anon rmap with an interval tree, and this will make it harder to impose such ordering as the interval tree will always be sorted by the avc->vma->vm_pgoff value. For now, let's replace the anon_vma_moveto_tail() ordering function with proper anon_vma locking in move_ptes(). Once we have the anon interval tree in place, we will re-introduce an optimization to avoid taking these locks in the most common cases. Signed-off-by: Michel Lespinasse <walken@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Daniel Santos <daniel.santos@pobox.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
author: Michel Lespinasse <walken@google.com> 2012-10-08 16:31:36 -0700
committer: Linus Torvalds <torvalds@linux-foundation.org> 2012-10-09 16:22:41 +0900
commit: 108d6642ad81bb1d62b401490a334d2c12397517 (patch)
tree: 27df7d1777d80b9dddeaefaac928b726ff82a816 /mm/mremap.c
parent: 9826a516ff77c5820e591211e4f3e58ff36f46be (diff)
download: lwn-108d6642ad81bb1d62b401490a334d2c12397517.tar.gz
lwn-108d6642ad81bb1d62b401490a334d2c12397517.zip
1 files changed, 5 insertions, 9 deletions
diff --git a/mm/mremap.c b/mm/mremap.c
index cc06d0e48d05..5588bb6e9295 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -74,6 +74,7 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
 		unsigned long new_addr)
 {
 	struct address_space *mapping = NULL;
+	struct anon_vma *anon_vma = vma->anon_vma;
 	struct mm_struct *mm = vma->vm_mm;
 	pte_t *old_pte, *new_pte, pte;
 	spinlock_t *old_ptl, *new_ptl;
@@ -88,6 +89,8 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
 		mapping = vma->vm_file->f_mapping;
 		mutex_lock(&mapping->i_mmap_mutex);
 	}
+	if (anon_vma)
+		anon_vma_lock(anon_vma);
 
 	/*
 	 * We don't have to worry about the ordering of src and dst
@@ -114,6 +117,8 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
 		spin_unlock(new_ptl);
 	pte_unmap(new_pte - 1);
 	pte_unmap_unlock(old_pte - 1, old_ptl);
+	if (anon_vma)
+		anon_vma_unlock(anon_vma);
 	if (mapping)
 		mutex_unlock(&mapping->i_mmap_mutex);
 }
@@ -221,15 +226,6 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 	moved_len = move_page_tables(vma, old_addr, new_vma, new_addr, old_len);
 	if (moved_len < old_len) {
 		/*
-		 * Before moving the page tables from the new vma to
-		 * the old vma, we need to be sure the old vma is
-		 * queued after new vma in the same_anon_vma list to
-		 * prevent SMP races with rmap_walk (that could lead
-		 * rmap_walk to miss some page table).
-		 */
-		anon_vma_moveto_tail(vma);
-
-		/*
 		 * On error, move entries back from new area to old,
 		 * which will succeed since page tables still there,
 		 * and then proceed to unmap new area instead of old.
author	Michel Lespinasse <walken@google.com>	2012-10-08 16:31:36 -0700
committer	Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:41 +0900
commit	108d6642ad81bb1d62b401490a334d2c12397517 (patch)
tree	27df7d1777d80b9dddeaefaac928b726ff82a816 /mm/mremap.c
parent	9826a516ff77c5820e591211e4f3e58ff36f46be (diff)
download	lwn-108d6642ad81bb1d62b401490a334d2c12397517.tar.gz lwn-108d6642ad81bb1d62b401490a334d2c12397517.zip