author: Paul Jackson <pj@sgi.com> 2006-05-22 17:56:07 -0700
committer: Chris Wright <chrisw@sous-sol.org> 2006-06-05 10:18:13 -0700
commit: e81ccf5afaf9cff1ed27c29ff96795c146a3d571 (patch)
tree: 74db8a3678d23f171ddced32d75bac5daa679adf
parent: 3e211bbe2295ff89490f01d54c6300a8a58fac7c (diff)
[PATCH] Cpuset: might sleep checking zones allowed fix

Fix an infrequently encountered 'sleeping function called from invalid
context' in the cpuset hooks in __alloc_pages.  Could sleep while
interrupts disabled.

The routine cpuset_zone_allowed() is called by code in mm/page_alloc.c
__alloc_pages() to determine if a zone is allowed in the current task's
cpuset.  This routine can sleep, for certain GFP_KERNEL allocations, if the
zone is on a memory node not allowed in the current cpuset, but might be
allowed in a parent cpuset.

But we can't sleep in __alloc_pages() if in interrupt, nor if called for a
GFP_ATOMIC request (__GFP_WAIT not set in gfp_flags).

The rule was intended to be:

  Don't call cpuset_zone_allowed() if you can't sleep, unless you
  pass in the __GFP_HARDWALL flag set in gfp_flags, which disables
  the code that might scan up ancestor cpusets and sleep.

This rule was being violated due to a bogus change made (by myself, pj) to
__alloc_pages() as part of the November 2005 effort to clean up its logic.

The bogus change can be seen at:

  http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-11/4691.html
  [PATCH 01/05] mm fix __alloc_pages cpuset ALLOC_* flags

This was first noticed on a tight memory system, in code that was disabling
interrupts and doing allocation requests with __GFP_WAIT not set, which
resulted in __might_sleep() writing complaints to the log "Debug: sleeping
function called ...", when the code in cpuset_zone_allowed() tried to take
the callback_sem cpuset semaphore.

Special thanks to Dave Chinner, for figuring this out, and a tip of the hat
to Nick Piggin who warned me of this back in Nov 2005, before I was ready
to listen.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
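
The rule described in the message can be modelled in a few lines of userspace C. This is only an illustrative sketch, not kernel code: every SKETCH_* constant and sketch_* function below is a hypothetical stand-in invented for this example, and the bit values are arbitrary rather than the kernel's real definitions. It shows (a) the condition under which cpuset_zone_allowed() may be called, and (b) the shape of the fix, where the cpuset check is requested only when the allocation is allowed to wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for illustration; NOT the kernel's definitions. */
#define SKETCH_GFP_WAIT      0x1u  /* stand-in for __GFP_WAIT */
#define SKETCH_GFP_HIGH      0x2u  /* stand-in for __GFP_HIGH */
#define SKETCH_GFP_HARDWALL  0x4u  /* stand-in for __GFP_HARDWALL */

#define SKETCH_ALLOC_HIGH    0x1u  /* stand-in for ALLOC_HIGH */
#define SKETCH_ALLOC_CPUSET  0x2u  /* stand-in for ALLOC_CPUSET */

/*
 * The intended rule: the cpuset check may be made only if the caller
 * can sleep, or if the hardwall flag is set (which, per the message,
 * disables the ancestor scan that might sleep).
 */
static bool sketch_may_check_cpuset(bool can_sleep, unsigned int gfp_mask)
{
	return can_sleep || (gfp_mask & SKETCH_GFP_HARDWALL) != 0;
}

/*
 * Shape of the fix: request the cpuset check only when the allocation
 * is allowed to wait, instead of unconditionally.
 */
static unsigned int sketch_alloc_flags(unsigned int gfp_mask)
{
	bool wait = (gfp_mask & SKETCH_GFP_WAIT) != 0;
	unsigned int alloc_flags = 0;

	if (gfp_mask & SKETCH_GFP_HIGH)
		alloc_flags |= SKETCH_ALLOC_HIGH;
	if (wait)
		alloc_flags |= SKETCH_ALLOC_CPUSET;
	return alloc_flags;
}

/* Small self-check that exercises both helpers. */
static void sketch_self_check(void)
{
	/* An atomic-style request (no wait bit) must not ask for the check. */
	assert(!(sketch_alloc_flags(SKETCH_GFP_HIGH) & SKETCH_ALLOC_CPUSET));
	/* A request that may wait does ask for it. */
	assert(sketch_alloc_flags(SKETCH_GFP_WAIT) & SKETCH_ALLOC_CPUSET);
	/* Cannot sleep and no hardwall: the check is not permitted. */
	assert(!sketch_may_check_cpuset(false, 0));
	/* Cannot sleep but hardwall set: the check is permitted. */
	assert(sketch_may_check_cpuset(false, SKETCH_GFP_HARDWALL));
}
```

Calling sketch_self_check() from a test program's main() exercises both the rule and the flag computation. The sketch deliberately leaves out the ALLOC_HARDER handling and everything else __alloc_pages() does.
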
-rw-r--r-- mm/page_alloc.c 3
1 files changed, 2 insertions, 1 deletions
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 61de2220231e..8b3cde1eb45e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -949,7 +949,8 @@ restart:
 		alloc_flags |= ALLOC_HARDER;
 	if (gfp_mask & __GFP_HIGH)
 		alloc_flags |= ALLOC_HIGH;
-	alloc_flags |= ALLOC_CPUSET;
+	if (wait)
+		alloc_flags |= ALLOC_CPUSET;
 
 	/*
 	 * Go through the zonelist again. Let __GFP_HIGH and allocations