summaryrefslogtreecommitdiff
path: root/mm/swap_cgroup.c
diff options
context:
space:
mode:
authorJohannes Weiner <hannes@cmpxchg.org>2020-06-03 16:02:14 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2020-06-03 20:09:48 -0700
commit2d1c498072de69e2857b849ee197ba2aa7de53a3 (patch)
tree3da860d08fe32b893e4f3946c862381aa9a06469 /mm/swap_cgroup.c
parenteccb52e7880973f221ab2606e4d22ce04d96a1a9 (diff)
downloadlwn-2d1c498072de69e2857b849ee197ba2aa7de53a3.tar.gz
lwn-2d1c498072de69e2857b849ee197ba2aa7de53a3.zip
mm: memcontrol: make swap tracking an integral part of memory control
Without swap page tracking, users that are otherwise memory controlled can easily escape their containment and allocate significant amounts of memory that they're not being charged for. That's because swap does readahead, but without the cgroup records of who owned the page at swapout, readahead pages don't get charged until somebody actually faults them into their page table and we can identify an owner task. This can be maliciously exploited with MADV_WILLNEED, which triggers arbitrary readahead allocations without charging the pages. Make swap swap page tracking an integral part of memcg and remove the Kconfig options. In the first place, it was only made configurable to allow users to save some memory. But the overhead of tracking cgroup ownership per swap page is minimal - 2 byte per page, or 512k per 1G of swap, or 0.04%. Saving that at the expense of broken containment semantics is not something we should present as a coequal option. The swapaccount=0 boot option will continue to exist, and it will eliminate the page_counter overhead and hide the swap control files, but it won't disable swap slot ownership tracking. This patch makes sure we always have the cgroup records at swapin time; the next patch will fix the actual bug by charging readahead swap pages at swapin time rather than at fault time. v2: fix double swap charge bug in cgroup1/cgroup2 code gating [hannes@cmpxchg.org: fix crash with cgroup_disable=memory] Link: http://lkml.kernel.org/r/20200521215855.GB815153@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Link: http://lkml.kernel.org/r/20200508183105.225460-16-hannes@cmpxchg.org Debugged-by: Hugh Dickins <hughd@google.com> Debugged-by: Michal Hocko <mhocko@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/swap_cgroup.c')
-rw-r--r--mm/swap_cgroup.c6
1 files changed, 0 insertions, 6 deletions
diff --git a/mm/swap_cgroup.c b/mm/swap_cgroup.c
index 7aa764f09079..7f34343c075a 100644
--- a/mm/swap_cgroup.c
+++ b/mm/swap_cgroup.c
@@ -171,9 +171,6 @@ int swap_cgroup_swapon(int type, unsigned long max_pages)
unsigned long length;
struct swap_cgroup_ctrl *ctrl;
- if (cgroup_memory_noswap)
- return 0;
-
length = DIV_ROUND_UP(max_pages, SC_PER_PAGE);
array_size = length * sizeof(void *);
@@ -209,9 +206,6 @@ void swap_cgroup_swapoff(int type)
unsigned long i, length;
struct swap_cgroup_ctrl *ctrl;
- if (cgroup_memory_noswap)
- return;
-
mutex_lock(&swap_cgroup_mutex);
ctrl = &swap_cgroup_ctrl[type];
map = ctrl->map;