| author | Tejun Heo <tj@kernel.org> | 2026-06-01 08:37:28 -1000 |
|---|---|---|
| committer | Alexei Starovoitov <ast@kernel.org> | 2026-06-05 08:22:36 -0700 |
| commit | f64c723741c911544cca4c838d7a291b06b3ad1d (patch) | |
| tree | a9c1224ebd7b471b2b8368216f6067ba56abc483 /include/linux | |
| parent | aa496720618f1a6054f1c870bf10b4f6c99bf656 (diff) | |
bpf: Replace scratch PTE atomically when allocating arena pages
apply_range_set_cb() maps the pages for a new arena allocation and returned
-EBUSY when the target PTE was already populated. Kernel-fault recovery
leaves the per-arena scratch page in unallocated arena PTEs, so a later
bpf_arena_alloc_pages() over such a page hits that -EBUSY, and every
subsequent allocation of it fails the same way. Allocation must install the
real page over scratch instead.
Overwriting the scratch PTE in place is a valid->valid change, which arm64
forbids without break-before-make. Route through an invalid entry instead:
ptep_try_set() fills only a none slot, so the PTE goes scratch->none->page.
On finding scratch, clear it and flush_tlb_before_set() before retrying. The
new flush_tlb_before_set() is a no-op except on arches like arm64 that need
the break-before-make TLB invalidate. The loop also copes with a concurrent
fault re-scratching the slot.
Arches without ptep_try_set() never install the scratch page, so keep the
must-be-empty check and set_pte_at() for them.
Fixes: dc11a4dba246 ("bpf: Recover arena kernel faults with scratch page")
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260601183728.1800490-1-tj@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Diffstat (limited to 'include/linux')
| -rw-r--r-- | include/linux/pgtable.h | 18 |
1 files changed, 18 insertions, 0 deletions
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index b5739bb99fc1..4c6c4081ef71 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1061,6 +1061,24 @@ static inline bool ptep_try_set(pte_t *ptep, pte_t new_pte)
 }
 #endif
 
+#ifndef flush_tlb_before_set
+/**
+ * flush_tlb_before_set - invalidate a kernel PTE's TLB before re-setting it
+ * @addr: kernel virtual address whose PTE was just cleared
+ *
+ * Some architectures (e.g. arm64) do not allow a live page-table entry to be
+ * repointed at a different page in one step. The old entry must first be made
+ * invalid and its translation flushed from every TLB, and only then may the new
+ * entry be written.
+ *
+ * This is only for the lockless atomic kernel-PTE installers (ptep_try_set()).
+ * It must be callable with interrupts disabled.
+ */
+static inline void flush_tlb_before_set(unsigned long addr)
+{
+}
+#endif
+
 #ifndef wrprotect_ptes
 /**
  * wrprotect_ptes - Write-protect PTEs that map consecutive pages of the same
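The scratch->none->page sequence described in the commit message can be modeled in userspace. The following is only an illustrative sketch, not the kernel code: the arena change itself is outside this diffstat, so every toy_* name, the PTE_SCRATCH value, and the use of C11 compare-exchange to clear the scratch entry are assumptions made for the example. It shows why the loop retries: installation succeeds only into a none slot, a scratch entry is first cleared and "flushed", and a slot holding a real page is still refused.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pte_t;               /* toy PTE; 0 stands for "none" */
#define PTE_NONE    ((pte_t)0)
#define PTE_SCRATCH ((pte_t)0x1000)   /* hypothetical scratch-page entry */

static int flush_count;               /* counts toy TLB flushes */

/* Toy stand-in for ptep_try_set(): fills the slot only if it is none. */
static bool toy_ptep_try_set(_Atomic pte_t *ptep, pte_t new_pte)
{
    pte_t expected = PTE_NONE;
    return atomic_compare_exchange_strong(ptep, &expected, new_pte);
}

/* Toy stand-in for flush_tlb_before_set(). The generic kernel version is
 * an empty function; this one only counts calls so they can be observed. */
static void toy_flush_tlb_before_set(unsigned long addr)
{
    (void)addr;
    flush_count++;
}

/* Install 'page' at *ptep. Returns 0 on success, or -1 (standing in for
 * -EBUSY) if the slot already holds something other than scratch. */
static int toy_install_page(_Atomic pte_t *ptep, unsigned long addr, pte_t page)
{
    for (;;) {
        if (toy_ptep_try_set(ptep, page))
            return 0;                 /* slot was none: page installed */

        pte_t old = atomic_load(ptep);
        if (old == PTE_NONE)
            continue;                 /* slot emptied meanwhile: retry */
        if (old != PTE_SCRATCH)
            return -1;                /* a real page is already mapped */

        /* Break-before-make: make the entry invalid, flush, then retry.
         * If a concurrent fault re-scratches the slot, the loop repeats. */
        if (atomic_compare_exchange_strong(ptep, &old, PTE_NONE))
            toy_flush_tlb_before_set(addr);
    }
}

/* Small self-check exercising the scratch and busy cases. */
static void toy_demo(void)
{
    _Atomic pte_t slot = PTE_SCRATCH;

    assert(toy_install_page(&slot, 0x1000UL, (pte_t)0xabc000) == 0);
    assert(atomic_load(&slot) == (pte_t)0xabc000);
    assert(toy_install_page(&slot, 0x1000UL, (pte_t)0xdef000) == -1);
}
```

Calling toy_demo() from a main() exercises both paths. The model keeps the property the commit message relies on: the slot never changes from one valid entry directly to another, because the only way to write a valid entry is through the none-only toy_ptep_try_set().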
