summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide/cgroup-v2.rst
diff options
context:
space:
mode:
authorRoman Gushchin <guro@fb.com>2018-06-07 17:07:46 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2018-06-07 17:34:36 -0700
commitbf8d5d52ffe89aac5b46ddb39dd1a4351fae5df4 (patch)
treee0b0457ddf128b0562eb403762b2f2de2292e8b1 /Documentation/admin-guide/cgroup-v2.rst
parentfb52bbaee598f58352d8732637ebe7013b2df79f (diff)
downloadlwn-bf8d5d52ffe89aac5b46ddb39dd1a4351fae5df4.tar.gz
lwn-bf8d5d52ffe89aac5b46ddb39dd1a4351fae5df4.zip
memcg: introduce memory.min
Memory controller implements the memory.low best-effort memory protection mechanism, which works perfectly in many cases and allows protecting working sets of important workloads from sudden reclaim. But its semantics has a significant limitation: it works only as long as there is a supply of reclaimable memory. This makes it pretty useless against any sort of slow memory leaks or memory usage increases. This is especially true for swapless systems. If swap is enabled, memory soft protection effectively postpones problems, allowing a leaking application to fill all swap area, which makes no sense. The only effective way to guarantee the memory protection in this case is to invoke the OOM killer. It's possible to handle this case in userspace by reacting on MEMCG_LOW events; but there is still a place for a fail-safe in-kernel mechanism to provide stronger guarantees. This patch introduces the memory.min interface for cgroup v2 memory controller. It works very similarly to memory.low (sharing the same hierarchical behavior), except that it's not disabled if there is no more reclaimable memory in the system. If cgroup is not populated, its memory.min is ignored, because otherwise even the OOM killer wouldn't be able to reclaim the protected memory, and the system can stall. [guro@fb.com: s/low/min/ in docs] Link: http://lkml.kernel.org/r/20180510130758.GA9129@castle.DHCP.thefacebook.com Link: http://lkml.kernel.org/r/20180509180734.GA4856@castle.DHCP.thefacebook.com Signed-off-by: Roman Gushchin <guro@fb.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'Documentation/admin-guide/cgroup-v2.rst')
-rw-r--r--Documentation/admin-guide/cgroup-v2.rst27
1 files changed, 25 insertions, 2 deletions
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 7b56ca80e37a..e34d3c938729 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1001,6 +1001,29 @@ PAGE_SIZE multiple when read back.
The total amount of memory currently being used by the cgroup
and its descendants.
+ memory.min
+ A read-write single value file which exists on non-root
+ cgroups. The default is "0".
+
+ Hard memory protection. If the memory usage of a cgroup
+ is within its effective min boundary, the cgroup's memory
+ won't be reclaimed under any conditions. If there is no
+ unprotected reclaimable memory available, OOM killer
+ is invoked.
+
+ Effective min boundary is limited by memory.min values of
+ all ancestor cgroups. If there is memory.min overcommitment
+ (child cgroup or cgroups are requiring more protected memory
+ than parent will allow), then each child cgroup will get
+ the part of parent's protection proportional to its
+ actual memory usage below memory.min.
+
+ Putting more memory than generally available under this
+ protection is discouraged and may lead to constant OOMs.
+
+ If a memory cgroup is not populated with processes,
+ its memory.min is ignored.
+
memory.low
A read-write single value file which exists on non-root
cgroups. The default is "0".
@@ -1012,9 +1035,9 @@ PAGE_SIZE multiple when read back.
Effective low boundary is limited by memory.low values of
all ancestor cgroups. If there is memory.low overcommitment
- (child cgroup or cgroups are requiring more protected memory,
+ (child cgroup or cgroups are requiring more protected memory
than parent will allow), then each child cgroup will get
- the part of parent's protection proportional to the its
+ the part of parent's protection proportional to its
actual memory usage below memory.low.
Putting more memory than generally available under this