diff options
author | Wanpeng Li <liwanp@linux.vnet.ibm.com> | 2013-09-11 14:22:53 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2013-09-11 15:58:08 -0700 |
commit | 0cea3fdc416d593072c602725ed2ca02b889f31b (patch) | |
tree | 11249ad6aa5e61fd8dbc42d277d733061f1b15e4 | |
parent | f9121153fdfbfaa930bf65077a5597e20d3ac608 (diff) | |
download | lwn-0cea3fdc416d593072c602725ed2ca02b889f31b.tar.gz lwn-0cea3fdc416d593072c602725ed2ca02b889f31b.zip |
mm/hwpoison: fix race against poison thp
There is a race between hwpoison page and unpoison page, memory_failure
set the page hwpoison and increase num_poisoned_pages without hold page
lock, and one page count will be accounted against thp for
num_poisoned_pages. However, unpoison can occur before memory_failure
hold page lock and split transparent hugepage, unpoison will decrease
num_poisoned_pages by 1 << compound_order since memory_failure has not yet
split transparent hugepage with page lock held. That means we account one
page for hwpoison and 1 << compound_order for unpoison. This patch fix it
by inserting a PageTransHuge check before doing TestClearPageHWPoison,
unpoison failed without clearing PageHWPoison and decreasing
num_poisoned_pages.
A B
memory_failue
TestSetPageHWPoison(p);
if (PageHuge(p))
nr_pages = 1 << compound_order(hpage);
else
nr_pages = 1;
atomic_long_add(nr_pages, &num_poisoned_pages);
unpoison_memory
nr_pages = 1<< compound_trans_order(page);
if(TestClearPageHWPoison(p))
atomic_long_sub(nr_pages, &num_poisoned_pages);
lock page
if (!PageHWPoison(p))
unlock page and return
hwpoison_user_mappings
if (PageTransHuge(hpage))
split_huge_page(hpage);
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Suggested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r-- | mm/memory-failure.c | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 7b5d32507c35..32351ec32048 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1342,6 +1342,16 @@ int unpoison_memory(unsigned long pfn) return 0; } + /* + * unpoison_memory() can encounter thp only when the thp is being + * worked by memory_failure() and the page lock is not held yet. + * In such case, we yield to memory_failure() and make unpoison fail. + */ + if (PageTransHuge(page)) { + pr_info("MCE: Memory failure is now running on %#lx\n", pfn); + return 0; + } + nr_pages = 1 << compound_order(page); if (!get_page_unless_zero(page)) { |