mm: fix unmap_mapping_range high bits shift bug - lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet


author	Jiajun Xie <jiajun.xie.sh@gmail.com>	2023-12-20 13:28:39 +0800
committer	Andrew Morton <akpm@linux-foundation.org>	2023-12-29 11:06:48 -0800
commit	9eab0421fa94a3dde0d1f7e36ab3294fc306c99d (patch)
tree	7a4483317f2f27c5c3a39ac55597f78ca9497121 /CREDITS
parent	9bcef5973e31020e5aa8571eb994d67b77318356 (diff)

mm: fix unmap_mapping_range high bits shift bug

The bug happens when the highest bit of holebegin is 1. holebegin is a
signed loff_t, so the right shift sign-extends. Suppose holebegin is
0x8000000111111000: after the shift, hba would be 0xfff8000000111111,
and vma_interval_tree_foreach would then fail the lookup or return the
wrong result.

Example of a call sequence that triggers the error:

- mmap(..., offset=0x8000000111111000)
  |- syscall(mmap, ... unsigned long, off):
     |- ksys_mmap_pgoff( ... , off >> PAGE_SHIFT);

  Here pgoff is correctly shifted to 0x8000000111111, but passing
  0x8000000111111000 as holebegin to unmap then gives a badly wrong
  result, as shown below:

- unmap_mapping_range(..., loff_t const holebegin)
  |- pgoff_t hba = holebegin >> PAGE_SHIFT;
          /* hba = 0xfff8000000111111 unexpectedly */

The issue shows up in heterogeneous computing, where the device (e.g. a
GPU) and the host share the same virtual address space.

A simple workflow that hits the issue is:

        /* host */
    1. Userspace first mmaps a file-backed VA range with a specified
       offset, e.g. (offset=0x800..., mmap return: va_a).
    2. It writes some data to the corresponding sys page,
       e.g. (va_a = 0xAABB).
        /* device */
    3. The GPU workload touches the VA, which triggers a GPU fault and
       notifies the host.
        /* host */
    4. The host receives the GPU fault notification, then it will:
        4.1 unmap the host pages and also take care of the CPU TLB
            (using unmap_mapping_range with offset=0x800...)
        4.2 migrate the sys page to the device
        4.3 set up the device page table and resolve the device fault.
        /* device */
    5. The GPU workload continues; it accesses va_a and gets 0xAABB.
    6. The GPU workload continues; it writes 0xBBCC to va_a.
        /* host */
    7. Userspace accesses va_a; as expected, this will:
        7.1 trigger a CPU VM fault.
        7.2 have the driver handle the fault and migrate the GPU-local
            page to the host.
    8. Userspace can then correctly read 0xBBCC from va_a.
    9. Done.

But if step 4.1 hits the bug described in this patch, userspace never
triggers the CPU fault and still reads the old value, 0xAABB.

Making holebegin unsigned before the shift fixes the bug.

Link: https://lkml.kernel.org/r/20231220052839.26970-1-jiajun.xie.sh@gmail.com
Signed-off-by: Jiajun Xie <jiajun.xie.sh@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


