diff options
author | Yongseok Koh <yongseok.koh@samsung.com> | 2010-01-19 17:33:49 +0900 |
---|---|---|
committer | Greg Kroah-Hartman <gregkh@suse.de> | 2010-01-25 10:49:43 -0800 |
commit | f2fa92b29da64ead309293a048fe41776338a280 (patch) | |
tree | 2d6efa1d3019393e6c59f265ac710c9e4bb752b9 | |
parent | 3d0cc9a9a1cd4a043ffcd3251161d07a846a103c (diff) | |
download | lwn-f2fa92b29da64ead309293a048fe41776338a280.tar.gz lwn-f2fa92b29da64ead309293a048fe41776338a280.zip |
vmalloc: remove BUG_ON due to racy counting of VM_LAZY_FREE
commit 88f5004430babb836cfce886d5d54c82166f8ba4 upstream.
In free_unmap_area_noflush(), va->flags is marked as VM_LAZY_FREE first, and
then vmap_lazy_nr is increased atomically.
But, in __purge_vmap_area_lazy(), while traversing of vmap_are_list, nr
is counted by checking VM_LAZY_FREE is set to va->flags. After counting
the variable nr, kernel reads vmap_lazy_nr atomically and checks a
BUG_ON condition whether nr is greater than vmap_lazy_nr to prevent
vmap_lazy_nr from being negative.
The problem is that, if interrupted right after marking VM_LAZY_FREE,
increment of vmap_lazy_nr can be delayed. Consequently, BUG_ON
condition can be met because nr is counted more than vmap_lazy_nr.
It is highly probable when vmalloc/vfree are called frequently. This
scenario have been verified by adding delay between marking VM_LAZY_FREE
and increasing vmap_lazy_nr in free_unmap_area_noflush().
Even the vmap_lazy_nr is for checking high watermark, it never be the
strict watermark. Although the BUG_ON condition is to prevent
vmap_lazy_nr from being negative, vmap_lazy_nr is signed variable. So,
it could go down to negative value temporarily.
Consequently, removing the BUG_ON condition is proper.
A possible BUG_ON message is like the below.
kernel BUG at mm/vmalloc.c:517!
invalid opcode: 0000 [#1] SMP
EIP: 0060:[<c04824a4>] EFLAGS: 00010297 CPU: 3
EIP is at __purge_vmap_area_lazy+0x144/0x150
EAX: ee8a8818 EBX: c08e77d4 ECX: e7c7ae40 EDX: c08e77ec
ESI: 000081fe EDI: e7c7ae60 EBP: e7c7ae64 ESP: e7c7ae3c
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Call Trace:
[<c0482ad9>] free_unmap_vmap_area_noflush+0x69/0x70
[<c0482b02>] remove_vm_area+0x22/0x70
[<c0482c15>] __vunmap+0x45/0xe0
[<c04831ec>] vmalloc+0x2c/0x30
Code: 8d 59 e0 eb 04 66 90 89 cb 89 d0 e8 87 fe ff ff 8b 43 20 89 da 8d 48 e0 8d 43 20 3b 04 24 75 e7 fe 05 a8 a5 a3 c0 e9 78 ff ff ff <0f> 0b eb fe 90 8d b4 26 00 00 00 00 56 89 c6 b8 ac a5 a3 c0 31
EIP: [<c04824a4>] __purge_vmap_area_lazy+0x144/0x150 SS:ESP 0068:e7c7ae3c
[ See also http://marc.info/?l=linux-kernel&m=126335856228090&w=2 ]
Signed-off-by: Yongseok Koh <yongseok.koh@samsung.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-rw-r--r-- | mm/vmalloc.c | 4 |
1 files changed, 1 insertions, 3 deletions
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 775872681d0c..a3a99d39188b 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -555,10 +555,8 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end, } rcu_read_unlock(); - if (nr) { - BUG_ON(nr > atomic_read(&vmap_lazy_nr)); + if (nr) atomic_sub(nr, &vmap_lazy_nr); - } if (nr || force_flush) flush_tlb_kernel_range(*start, *end); |