author | Hugh Dickins <hughd@google.com> | 2023-06-08 18:18:49 -0700 |
---|---|---|
committer | Andrew Morton <akpm@linux-foundation.org> | 2023-06-19 16:19:14 -0700 |
commit | be872f83bf571f4f9a0ac25e2c9c36e905a36619 (patch) | |
tree | de0eabc968377f347394dca48b93dbebe79d938d /mm/pagewalk.c | |
parent | 7780d04046a2288ab85d88bedacc60fa4fad9971 (diff) | |
mm/pagewalk: walk_pte_range() allow for pte_offset_map()

walk_pte_range() has a no_vma option to serve walk_page_range_novma(). I
don't know of any problem, but it looks safer to check for init_mm, and
use pte_offset_kernel() rather than pte_offset_map() in that case:
pte_offset_map()'s pmdval validation is intended for userspace.

Allow for its pte_offset_map() or pte_offset_map_lock() to fail, and retry
with ACTION_AGAIN if so. Add a second check for ACTION_AGAIN in
walk_pmd_range(), to catch it after return from walk_pte_range().

Remove the pmd_trans_unstable() check after split_huge_pmd() in
walk_pmd_range(): walk_pte_range() now handles those cases safely (and
they must fail powerpc's is_hugepd() check).

Link: https://lkml.kernel.org/r/3eba6f0-2b-fb66-6bb6-2ee8533e221@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zack Rusin <zackr@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'mm/pagewalk.c')
-rw-r--r-- | mm/pagewalk.c | 33 |
1 files changed, 23 insertions, 10 deletions
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index cb23f8a15c13..64437105fe0d 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -46,15 +46,27 @@ static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	spinlock_t *ptl;
 
 	if (walk->no_vma) {
-		pte = pte_offset_map(pmd, addr);
-		err = walk_pte_range_inner(pte, addr, end, walk);
-		pte_unmap(pte);
+		/*
+		 * pte_offset_map() might apply user-specific validation.
+		 */
+		if (walk->mm == &init_mm)
+			pte = pte_offset_kernel(pmd, addr);
+		else
+			pte = pte_offset_map(pmd, addr);
+		if (pte) {
+			err = walk_pte_range_inner(pte, addr, end, walk);
+			if (walk->mm != &init_mm)
+				pte_unmap(pte);
+		}
 	} else {
 		pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
-		err = walk_pte_range_inner(pte, addr, end, walk);
-		pte_unmap_unlock(pte, ptl);
+		if (pte) {
+			err = walk_pte_range_inner(pte, addr, end, walk);
+			pte_unmap_unlock(pte, ptl);
+		}
 	}
-
+	if (!pte)
+		walk->action = ACTION_AGAIN;
 	return err;
 }
 
@@ -141,11 +153,8 @@ again:
 		    !(ops->pte_entry))
 			continue;
 
-		if (walk->vma) {
+		if (walk->vma)
 			split_huge_pmd(walk->vma, pmd, addr);
-			if (pmd_trans_unstable(pmd))
-				goto again;
-		}
 
 		if (is_hugepd(__hugepd(pmd_val(*pmd))))
 			err = walk_hugepd_range((hugepd_t *)pmd, addr, next, walk, PMD_SHIFT);
@@ -153,6 +162,10 @@ again:
 			err = walk_pte_range(pmd, addr, next, walk);
 		if (err)
 			break;
+
+		if (walk->action == ACTION_AGAIN)
+			goto again;
+
 	} while (pmd++, addr = next, addr != end);
 
 	return err;
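
The retry protocol described in the commit message (a failed PTE mapping sets ACTION_AGAIN, and the PMD-level caller then revisits the same entry) can be modelled in user space. The following is only a rough sketch under stated assumptions, not kernel code: every `sketch_` name is hypothetical, the mapping helper is a mock that fails a configurable number of times, and resetting the action at the retry label is an assumption about how the caller behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum sketch_action { SKETCH_ACTION_SUBTREE, SKETCH_ACTION_AGAIN };

struct sketch_walk {
	enum sketch_action action;
	int map_failures_left;	/* how many times the mock mapping should fail */
	int attempts;		/* total calls into sketch_walk_pte_range() */
	int entries_visited;	/* PTE-level ranges actually walked */
};

/* Dummy object whose address stands for a successfully mapped PTE table. */
static int sketch_pte_table;

/* Mock of a mapping helper that may fail: returns NULL while failures remain. */
static int *sketch_pte_offset_map(struct sketch_walk *walk)
{
	assert(walk != NULL);
	if (walk->map_failures_left > 0) {
		walk->map_failures_left--;
		return NULL;
	}
	return &sketch_pte_table;
}

/*
 * Follows the shape of the patched walk_pte_range(): when the mapping
 * fails, do no work, report no error, and ask the caller to retry.
 */
static int sketch_walk_pte_range(struct sketch_walk *walk)
{
	int err = 0;
	int *pte = sketch_pte_offset_map(walk);

	walk->attempts++;
	if (pte)
		walk->entries_visited++;
	else
		walk->action = SKETCH_ACTION_AGAIN;
	return err;
}

/*
 * Follows the shape of the caller: walk each entry, and jump back to the
 * same entry when the lower level requested a retry.
 */
static int sketch_walk_pmd_range(struct sketch_walk *walk, int nr_pmds)
{
	int i = 0;
	int err = 0;

	if (nr_pmds <= 0)
		return 0;
	do {
again:
		/* Assumption: the action is reset before each attempt. */
		walk->action = SKETCH_ACTION_SUBTREE;
		err = sketch_walk_pte_range(walk);
		if (err)
			break;
		if (walk->action == SKETCH_ACTION_AGAIN)
			goto again;
	} while (++i < nr_pmds);
	return err;
}
```

In this model, a walk over three entries whose first two mapping attempts fail makes five calls into the PTE level and still visits all three entries. A failed mapping is treated as a transient condition to retry rather than an error to propagate, which is the behaviour the commit message describes.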