summaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel
diff options
context:
space:
mode:
authorNathan Lynch <nathanl@linux.ibm.com>2023-02-10 12:42:00 -0600
committerMichael Ellerman <mpe@ellerman.id.au>2023-02-13 22:35:02 +1100
commit43033bc62d349d8d852855a336c91d046de819bd (patch)
tree9f7ce98c09f955129e605446b45fedf442da0e20 /arch/powerpc/kernel
parent24098f580e2b5ceb2cec4f02833e0a0bb5d46d2e (diff)
downloadlwn-43033bc62d349d8d852855a336c91d046de819bd.tar.gz
lwn-43033bc62d349d8d852855a336c91d046de819bd.zip
powerpc/pseries: add RTAS work area allocator
Various pseries-specific RTAS functions take a temporary "work area" parameter - a buffer in memory accessible to RTAS. Typically such functions are passed the statically allocated rtas_data_buf buffer as the argument. This buffer is protected by a global spinlock. So users of rtas_data_buf cannot perform sleeping operations while accessing the buffer. Most RTAS functions that have a work area parameter can return a status (-2/990x) that indicates that the caller should retry. Before retrying, the caller may need to reschedule or sleep (see rtas_busy_delay() for details). This combination of factors leads to uncomfortable constructions like this: do { spin_lock(&rtas_data_buf_lock); rc = rtas_call(token, __pa(rtas_data_buf, ...); if (rc == 0) { /* parse or copy out rtas_data_buf contents */ } spin_unlock(&rtas_data_buf_lock); } while (rtas_busy_delay(rc)); Another unfortunately common way of handling this is for callers to blithely ignore the possibility of a -2/990x status and hope for the best. If users were allowed to perform blocking operations while owning a work area, the programming model would become less tedious and error-prone. Users could schedule away, sleep, or perform other blocking operations without having to release and re-acquire resources. We could continue to use a single work area buffer, and convert rtas_data_buf_lock to a mutex. But that would impose an unnecessarily coarse serialization on all users. As awkward as the current design is, it prevents longer running operations that need to repeatedly use rtas_data_buf from blocking the progress of others. There are more considerations. One is that while 4KB is fine for all current in-kernel uses, some RTAS calls can take much smaller buffers, and some (VPD, platform dumps) would likely benefit from larger ones. Another is that at least one RTAS function (ibm,get-vpd) has *two* work area parameters. And finally, we should expect the number of work area users in the kernel to increase over time as we introduce lockdown-compatible ABIs to replace less safe use cases based on sys_rtas/librtas. So a special-purpose allocator for RTAS work area buffers seems worth trying. Properties: * The backing memory for the allocator is reserved early in boot in order to satisfy RTAS addressing requirements, and then managed with genalloc. * Allocations can block, but they never fail (mempool-like). * Prioritizes first-come, first-serve fairness over throughput. * Early boot allocations before the allocator has been initialized are served via an internal static buffer. Intended to replace rtas_data_buf. New code that needs RTAS work area buffers should prefer this API. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-12-26929c8cce78@linux.ibm.com
Diffstat (limited to 'arch/powerpc/kernel')
-rw-r--r--arch/powerpc/kernel/rtas.c3
1 files changed, 3 insertions, 0 deletions
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index a6d0c46b4512..1f80f0b8a9ad 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -36,6 +36,7 @@
#include <asm/machdep.h>
#include <asm/mmu.h>
#include <asm/page.h>
+#include <asm/rtas-work-area.h>
#include <asm/rtas.h>
#include <asm/time.h>
#include <asm/trace.h>
@@ -1939,6 +1940,8 @@ void __init rtas_initialize(void)
#endif
ibm_open_errinjct_token = rtas_token("ibm,open-errinjct");
ibm_errinjct_token = rtas_token("ibm,errinjct");
+
+ rtas_work_area_reserve_arena(rtas_region);
}
int __init early_init_dt_scan_rtas(unsigned long node,