summaryrefslogtreecommitdiff
path: root/mm
diff options
context:
space:
mode:
authorDavid Rientjes <rientjes@google.com>2019-09-04 12:54:25 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2019-09-28 14:05:38 -0700
commit76e654cc91bbe627aa6067916f02a4d3ac041620 (patch)
tree338187c038bfaf4d2780ffce7e5ad6359dc97806 /mm
parentb39d0ee2632d2f4fb180e8e4eba33736283f23de (diff)
downloadlwn-76e654cc91bbe627aa6067916f02a4d3ac041620.tar.gz
lwn-76e654cc91bbe627aa6067916f02a4d3ac041620.zip
mm, page_alloc: allow hugepage fallback to remote nodes when madvised
For systems configured to always try hard to allocate transparent hugepages (thp defrag setting of "always") or for memory that has been explicitly madvised to MADV_HUGEPAGE, it is often better to fallback to remote memory to allocate the hugepage if the local allocation fails first. The point is to allow the initial call to __alloc_pages_node() to attempt to defragment local memory to make a hugepage available, if possible, rather than immediately fallback to remote memory. Local hugepages will always have a better access latency than remote (huge)pages, so an attempt to make a hugepage available locally is always preferred. If memory compaction cannot be successful locally, however, it is likely better to fallback to remote memory. This could take on two forms: either allow immediate fallback to remote memory or do per-zone watermark checks. It would be possible to fallback only when per-zone watermarks fail for order-0 memory, since that would require local reclaim for all subsequent faults so remote huge allocation is likely better than thrashing the local zone for large workloads. In this case, it is assumed that because the system is configured to try hard to allocate hugepages or the vma is advised to explicitly want to try hard for hugepages that remote allocation is better when local allocation and memory compaction have both failed. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm')
-rw-r--r--mm/mempolicy.c11
1 files changed, 11 insertions, 0 deletions
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 547cd403ed02..8caab1f81a52 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2133,6 +2133,17 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
mpol_cond_put(pol);
page = __alloc_pages_node(hpage_node,
gfp | __GFP_THISNODE, order);
+
+ /*
+ * If hugepage allocations are configured to always
+ * synchronous compact or the vma has been madvised
+ * to prefer hugepage backing, retry allowing remote
+ * memory as well.
+ */
+ if (!page && (gfp & __GFP_DIRECT_RECLAIM))
+ page = __alloc_pages_node(hpage_node,
+ gfp | __GFP_NORETRY, order);
+
goto out;
}
}