<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/mm, branch v6.8-rc3</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=v6.8-rc3</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=v6.8-rc3'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2024-01-30T01:12:16+00:00</updated>
<entry>
<title>Merge tag 'mm-hotfixes-stable-2024-01-28-23-21' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2024-01-30T01:12:16+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-01-30T01:12:16+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=6f3d7d5cedbad7a859bb34161a2807c58e6cdda8'/>
<id>urn:sha1:6f3d7d5cedbad7a859bb34161a2807c58e6cdda8</id>
<content type='text'>
Pull misc fixes from Andrew Morton:
 "22 hotfixes. 11 are cc:stable and the remainder address post-6.7
  issues or aren't considered appropriate for backporting"

* tag 'mm-hotfixes-stable-2024-01-28-23-21' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits)
  mm: thp_get_unmapped_area must honour topdown preference
  mm: huge_memory: don't force huge page alignment on 32 bit
  userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb
  selftests/mm: ksm_tests should only MADV_HUGEPAGE valid memory
  scs: add CONFIG_MMU dependency for vfree_atomic()
  mm/memory: fix folio_set_dirty() vs. folio_mark_dirty() in zap_pte_range()
  mm/huge_memory: fix folio_set_dirty() vs. folio_mark_dirty()
  selftests/mm: Update va_high_addr_switch.sh to check CPU for la57 flag
  selftests: mm: fix map_hugetlb failure on 64K page size systems
  MAINTAINERS: supplement of zswap maintainers update
  stackdepot: make fast paths lock-less again
  stackdepot: add stats counters exported via debugfs
  mm, kmsan: fix infinite recursion due to RCU critical section
  mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again
  selftests/mm: switch to bash from sh
  MAINTAINERS: add man-pages git trees
  mm: memcontrol: don't throttle dying tasks on memory.high
  mm: mmap: map MAP_STACK to VM_NOHUGEPAGE
  uprobes: use pagesize-aligned virtual address when replacing pages
  selftests/mm: mremap_test: fix build warning
  ...
</content>
</entry>
<entry>
<title>Merge tag 'fixes-2024-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock</title>
<updated>2024-01-28T17:41:39+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-01-28T17:41:39+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a08ebda97e2a32b8f1de2c8240337f726c2e1ba7'/>
<id>urn:sha1:a08ebda97e2a32b8f1de2c8240337f726c2e1ba7</id>
<content type='text'>
Pull memblock fix from Mike Rapoport:
 "Fix crash when reserved memory is not added to memory.

  When CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, the initialization
  of reserved pages may cause access of NODE_DATA() with invalid nid and
  crash.

  Add a fall back to early_pfn_to_nid() in memmap_init_reserved_pages()
  to ensure a valid node id is always passed to init_reserved_page()"

* tag 'fixes-2024-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock: fix crash when reserved memory is not added to memory
</content>
</entry>
<entry>
<title>mm: thp_get_unmapped_area must honour topdown preference</title>
<updated>2024-01-26T09:23:44+00:00</updated>
<author>
<name>Ryan Roberts</name>
<email>ryan.roberts@arm.com</email>
</author>
<published>2024-01-23T17:14:20+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=96204e15310c218fd9355bdcacd02fed1d18070e'/>
<id>urn:sha1:96204e15310c218fd9355bdcacd02fed1d18070e</id>
<content type='text'>
The addition of commit efa7df3e3bb5 ("mm: align larger anonymous mappings
on THP boundaries") caused the "virtual_address_range" mm selftest to
start failing on arm64.  Let's fix that regression.

There were 2 visible problems when running the test; 1) it takes much
longer to execute, and 2) the test fails.  Both are related:

The (first part of the) test allocates as many 1GB anonymous blocks as it
can in the low 256TB of address space, passing NULL as the addr hint to
mmap.  Before the faulty patch, all allocations were abutted and contained
in a single, merged VMA.  However, after this patch, each allocation is in
its own VMA, and there is a 2M gap between each VMA.  This causes the 2
problems in the test: 1) mmap becomes MUCH slower because there are so
many VMAs to check to find a new 1G gap.  2) mmap fails once it hits the
VMA limit (/proc/sys/vm/max_map_count).  Hitting this limit then causes a
subsequent calloc() to fail, which causes the test to fail.

The problem is that arm64 (unlike x86) selects
ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.  But __thp_get_unmapped_area()
allocates len+2M then always aligns to the bottom of the discovered gap. 
That causes the 2M hole.

Fix this by detecting cases where we can still achive the alignment goal
when moved to the top of the allocated area, if configured to prefer
top-down allocation.

While we are at it, fix thp_get_unmapped_area's use of pgoff, which should
always be zero for anonymous mappings.  Prior to the faulty change, while
it was possible for user space to pass in pgoff!=0, the old
mm-&gt;get_unmapped_area() handler would not use it.  thp_get_unmapped_area()
does use it, so let's explicitly zero it before calling the handler.  This
should also be the correct behavior for arches that define their own
get_unmapped_area() handler.

Link: https://lkml.kernel.org/r/20240123171420.3970220-1-ryan.roberts@arm.com
Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
Closes: https://lore.kernel.org/linux-mm/1e8f5ac7-54ce-433a-ae53-81522b2320e1@arm.com/
Signed-off-by: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: huge_memory: don't force huge page alignment on 32 bit</title>
<updated>2024-01-26T07:52:21+00:00</updated>
<author>
<name>Yang Shi</name>
<email>yang@os.amperecomputing.com</email>
</author>
<published>2024-01-18T18:05:05+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=4ef9ad19e17676b9ef071309bc62020e2373705d'/>
<id>urn:sha1:4ef9ad19e17676b9ef071309bc62020e2373705d</id>
<content type='text'>
commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
boundaries") caused two issues [1] [2] reported on 32 bit system or compat
userspace.

It doesn't make too much sense to force huge page alignment on 32 bit
system due to the constrained virtual address space.

[1] https://lore.kernel.org/linux-mm/d0a136a0-4a31-46bc-adf4-2db109a61672@kernel.org/
[2] https://lore.kernel.org/linux-mm/CAJuCfpHXLdQy1a2B6xN2d7quTYwg2OoZseYPZTRpU0eHHKD-sQ@mail.gmail.com/

Link: https://lkml.kernel.org/r/20240118180505.2914778-1-shy828301@gmail.com
Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
Signed-off-by: Yang Shi &lt;yang@os.amperecomputing.com&gt;
Reported-by: Jiri Slaby &lt;jirislaby@kernel.org&gt;
Reported-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Tested-by: Jiri Slaby &lt;jirislaby@kernel.org&gt;
Tested-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Christopher Lameter &lt;cl@linux.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb</title>
<updated>2024-01-26T07:52:21+00:00</updated>
<author>
<name>Lokesh Gidra</name>
<email>lokeshgidra@google.com</email>
</author>
<published>2024-01-17T22:37:29+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=67695f18d55924b2013534ef3bdc363bc9e14605'/>
<id>urn:sha1:67695f18d55924b2013534ef3bdc363bc9e14605</id>
<content type='text'>
In mfill_atomic_hugetlb(), mmap_changing isn't being checked
again if we drop mmap_lock and reacquire it. When the lock is not held,
mmap_changing could have been incremented. This is also inconsistent
with the behavior in mfill_atomic().

Link: https://lkml.kernel.org/r/20240117223729.1444522-1-lokeshgidra@google.com
Fixes: df2cc96e77011 ("userfaultfd: prevent non-cooperative events vs mcopy_atomic races") 
Signed-off-by: Lokesh Gidra &lt;lokeshgidra@google.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Axel Rasmussen &lt;axelrasmussen@google.com&gt;
Cc: Brian Geffon &lt;bgeffon@google.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Kalesh Singh &lt;kaleshsingh@google.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Nicolas Geoffray &lt;ngeoffray@google.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory: fix folio_set_dirty() vs. folio_mark_dirty() in zap_pte_range()</title>
<updated>2024-01-26T07:52:21+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2024-01-22T17:17:51+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=e4e3df290f65da6cb27dac1309389c916f27db1a'/>
<id>urn:sha1:e4e3df290f65da6cb27dac1309389c916f27db1a</id>
<content type='text'>
The correct folio replacement for "set_page_dirty()" is
"folio_mark_dirty()", not "folio_set_dirty()".  Using the latter won't
properly inform the FS using the dirty_folio() callback.

This has been found by code inspection, but likely this can result in some
real trouble when zapping dirty PTEs that point at clean pagecache folios.

Yuezhang Mo said: "Without this fix, testing the latest exfat with
xfstests, test cases generic/029 and generic/030 will fail."

Link: https://lkml.kernel.org/r/20240122171751.272074-1-david@redhat.com
Fixes: c46265030b0f ("mm/memory: page_remove_rmap() -&gt; folio_remove_rmap_pte()")
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Reported-by: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Closes: https://lkml.kernel.org/r/2445cedb-61fb-422c-8bfb-caf0a2beed62@arm.com
Reviewed-by: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Reviewed-by: Yuezhang Mo &lt;Yuezhang.Mo@sony.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/huge_memory: fix folio_set_dirty() vs. folio_mark_dirty()</title>
<updated>2024-01-26T07:52:21+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2024-01-22T17:54:07+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=db44c658f798ad907219f15e033229b8d1aadb93'/>
<id>urn:sha1:db44c658f798ad907219f15e033229b8d1aadb93</id>
<content type='text'>
The correct folio replacement for "set_page_dirty()" is
"folio_mark_dirty()", not "folio_set_dirty()".  Using the latter won't
properly inform the FS using the dirty_folio() callback.

This has been found by code inspection, but likely this can result in some
real trouble.

Link: https://lkml.kernel.org/r/20240122175407.307992-1-david@redhat.com
Fixes: a8e61d584eda0 ("mm/huge_memory: page_remove_rmap() -&gt; folio_remove_rmap_pmd()")
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again</title>
<updated>2024-01-26T07:52:20+00:00</updated>
<author>
<name>Zach O'Keefe</name>
<email>zokeefe@google.com</email>
</author>
<published>2024-01-18T18:19:53+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=9319b647902cbd5cc884ac08a8a6d54ce111fc78'/>
<id>urn:sha1:9319b647902cbd5cc884ac08a8a6d54ce111fc78</id>
<content type='text'>
(struct dirty_throttle_control *)-&gt;thresh is an unsigned long, but is
passed as the u32 divisor argument to div_u64().  On architectures where
unsigned long is 64 bytes, the argument will be implicitly truncated.

Use div64_u64() instead of div_u64() so that the value used in the "is
this a safe division" check is the same as the divisor.

Also, remove redundant cast of the numerator to u64, as that should happen
implicitly.

This would be difficult to exploit in memcg domain, given the ratio-based
arithmetic domain_drity_limits() uses, but is much easier in global
writeback domain with a BDI_CAP_STRICTLIMIT-backing device, using e.g. 
vm.dirty_bytes=(1&lt;&lt;32)*PAGE_SIZE so that dtc-&gt;thresh == (1&lt;&lt;32)

Link: https://lkml.kernel.org/r/20240118181954.1415197-1-zokeefe@google.com
Fixes: f6789593d5ce ("mm/page-writeback.c: fix divide by zero in bdi_dirty_limits()")
Signed-off-by: Zach O'Keefe &lt;zokeefe@google.com&gt;
Cc: Maxim Patlasov &lt;MPatlasov@parallels.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memcontrol: don't throttle dying tasks on memory.high</title>
<updated>2024-01-26T07:52:20+00:00</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2024-01-11T13:29:02+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=63fd327016fdfca6f2fa27eba3496bd079eb8ed3'/>
<id>urn:sha1:63fd327016fdfca6f2fa27eba3496bd079eb8ed3</id>
<content type='text'>
While investigating hosts with high cgroup memory pressures, Tejun
found culprit zombie tasks that had were holding on to a lot of
memory, had SIGKILL pending, but were stuck in memory.high reclaim.

In the past, we used to always force-charge allocations from tasks
that were exiting in order to accelerate them dying and freeing up
their rss. This changed for memory.max in a4ebf1b6ca1e ("memcg:
prohibit unconditional exceeding the limit of dying tasks"); it noted
that this can cause (userspace inducable) containment failures, so it
added a mandatory reclaim and OOM kill cycle before forcing charges.
At the time, memory.high enforcement was handled in the userspace
return path, which isn't reached by dying tasks, and so memory.high
was still never enforced by dying tasks.

When c9afe31ec443 ("memcg: synchronously enforce memory.high for large
overcharges") added synchronous reclaim for memory.high, it added
unconditional memory.high enforcement for dying tasks as well. The
callstack shows that this path is where the zombie is stuck in.

We need to accelerate dying tasks getting past memory.high, but we
cannot do it quite the same way as we do for memory.max: memory.max is
enforced strictly, and tasks aren't allowed to move past it without
FIRST reclaiming and OOM killing if necessary. This ensures very small
levels of excess. With memory.high, though, enforcement happens lazily
after the charge, and OOM killing is never triggered. A lot of
concurrent threads could have pushed, or could actively be pushing,
the cgroup into excess. The dying task will enter reclaim on every
allocation attempt, with little hope of restoring balance.

To fix this, skip synchronous memory.high enforcement on dying tasks
altogether again. Update memory.high path documentation while at it.

[hannes@cmpxchg.org: also handle tasks are being killed during the reclaim]
  Link: https://lkml.kernel.org/r/20240111192807.GA424308@cmpxchg.org
Link: https://lkml.kernel.org/r/20240111132902.389862-1-hannes@cmpxchg.org
Fixes: c9afe31ec443 ("memcg: synchronously enforce memory.high for large overcharges")
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Tejun Heo &lt;tj@kernel.org&gt;
Reviewed-by: Yosry Ahmed &lt;yosryahmed@google.com&gt;
Acked-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Dan Schatzberg &lt;schatzberg.dan@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/hugetlbfs/inode.c: mm/memory-failure.c: fix hugetlbfs hwpoison handling</title>
<updated>2024-01-26T07:52:20+00:00</updated>
<author>
<name>Sidhartha Kumar</name>
<email>sidhartha.kumar@oracle.com</email>
</author>
<published>2024-01-12T18:08:40+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=19d3e221807772f8443e565234a6fdc5a2b09d26'/>
<id>urn:sha1:19d3e221807772f8443e565234a6fdc5a2b09d26</id>
<content type='text'>
has_extra_refcount() makes the assumption that the page cache adds a ref
count of 1 and subtracts this in the extra_pins case.  Commit a08c7193e4f1
(mm/filemap: remove hugetlb special casing in filemap.c) modifies
__filemap_add_folio() by calling folio_ref_add(folio, nr); for all cases
(including hugtetlb) where nr is the number of pages in the folio.  We
should adjust the number of references coming from the page cache by
subtracing the number of pages rather than 1.

In hugetlbfs_read_iter(), folio_test_has_hwpoisoned() is testing the wrong
flag as, in the hugetlb case, memory-failure code calls
folio_test_set_hwpoison() to indicate poison.  folio_test_hwpoison() is
the correct function to test for that flag.

After these fixes, the hugetlb hwpoison read selftest passes all cases.

Link: https://lkml.kernel.org/r/20240112180840.367006-1-sidhartha.kumar@oracle.com
Fixes: a08c7193e4f1 ("mm/filemap: remove hugetlb special casing in filemap.c")
Signed-off-by: Sidhartha Kumar &lt;sidhartha.kumar@oracle.com&gt;
Closes: https://lore.kernel.org/linux-mm/20230713001833.3778937-1-jiaqiyan@google.com/T/#m8e1469119e5b831bbd05d495f96b842e4a1c5519
Reported-by: Muhammad Usama Anjum &lt;usama.anjum@collabora.com&gt;
Tested-by: Muhammad Usama Anjum &lt;usama.anjum@collabora.com&gt;
Acked-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Acked-by: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: James Houghton &lt;jthoughton@google.com&gt;
Cc: Jiaqi Yan &lt;jiaqiyan@google.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[6.7+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
