diff options
Diffstat (limited to 'tools/net/sunrpc/xdrgen/README')
-rw-r--r-- | tools/net/sunrpc/xdrgen/README | 244 |
1 files changed, 244 insertions, 0 deletions
diff --git a/tools/net/sunrpc/xdrgen/README b/tools/net/sunrpc/xdrgen/README new file mode 100644 index 000000000000..92f7738ad50c --- /dev/null +++ b/tools/net/sunrpc/xdrgen/README @@ -0,0 +1,244 @@ +xdrgen - Linux Kernel XDR code generator + +Introduction +------------ + +SunRPC programs are typically specified using a language defined by +RFC 4506. In fact, all IETF-published NFS specifications provide a +description of the specified protocol using this language. + +Since the 1990's, user space consumers of SunRPC have had access to +a tool that could read such XDR specifications and then generate C +code that implements the RPC portions of that protocol. This tool is +called rpcgen. + +This RPC-level code is code that handles input directly from the +network, and thus a high degree of memory safety and sanity checking +is needed to help ensure proper levels of security. Bugs in this +code can have significant impact on security and performance. + +However, it is code that is repetitive and tedious to write by hand. + +The C code generated by rpcgen makes extensive use of the facilities +of the user space TI-RPC library and libc. Furthermore, the dialect +of the generated code is very traditional K&R C. + +The Linux kernel's implementation of SunRPC-based protocols hand-roll +their XDR implementation. There are two main reasons for this: + +1. libtirpc (and its predecessors) operate only in user space. The + kernel's RPC implementation and its API are significantly + different than libtirpc. + +2. rpcgen-generated code is believed to be less efficient than code + that is hand-written. + +These days, gcc and its kin are capable of optimizing code better +than human authors. There are only a few instances where writing +XDR code by hand will make a measurable performance different. + +In addition, the current hand-written code in the Linux kernel is +difficult to audit and prove that it implements exactly what is in +the protocol specification. + +In order to accrue the benefits of machine-generated XDR code in the +kernel, a tool is needed that will output C code that works against +the kernel's SunRPC implementation rather than libtirpc. + +Enter xdrgen. + + +Dependencies +------------ + +These dependencies are typically packaged by Linux distributions: + +- python3 +- python3-lark +- python3-jinja2 + +These dependencies are available via PyPi: + +- pip install 'lark[interegular]' + + +XDR Specifications +------------------ + +When adding a new protocol implementation to the kernel, the XDR +specification can be derived by feeding a .txt copy of the RFC to +the script located in tools/net/sunrpc/extract.sh. + + $ extract.sh < rfc0001.txt > new2.x + + +Operation +--------- + +Once a .x file is available, use xdrgen to generate source and +header files containing an implementation of XDR encoding and +decoding functions for the specified protocol. + + $ ./xdrgen definitions new2.x > include/linux/sunrpc/xdrgen/new2.h + $ ./xdrgen declarations new2.x > new2xdr_gen.h + +and + + $ ./xdrgen source new2.x > new2xdr_gen.c + +The files are ready to use for a server-side protocol implementation, +or may be used as a guide for implementing these routines by hand. + +By default, the only comments added to this code are kdoc comments +that appear directly in front of the public per-procedure APIs. For +deeper introspection, specifying the "--annotate" flag will insert +additional comments in the generated code to help readers match the +generated code to specific parts of the XDR specification. + +Because the generated code is targeted for the Linux kernel, it +is tagged with a GPLv2-only license. + +The xdrgen tool can also provide lexical and syntax checking of +an XDR specification: + + $ ./xdrgen lint xdr/new.x + + +How It Works +------------ + +xdrgen does not use machine learning to generate source code. The +translation is entirely deterministic. + +RFC 4506 Section 6 contains a BNF grammar of the XDR specification +language. The grammar has been adapted for use by the Python Lark +module. + +The xdr.ebnf file in this directory contains the grammar used to +parse XDR specifications. xdrgen configures Lark using the grammar +in xdr.ebnf. Lark parses the target XDR specification using this +grammar, creating a parse tree. + +xdrgen then transforms the parse tree into an abstract syntax tree. +This tree is passed to a series of code generators. + +The generators are implemented as Python classes residing in the +generators/ directory. Each generator emits code created from Jinja2 +templates stored in the templates/ directory. + +The source code is generated in the same order in which they appear +in the specification to ensure the generated code compiles. This +conforms with the behavior of rpcgen. + +xdrgen assumes that the generated source code is further compiled by +a compiler that can optimize in a number of ways, including: + + - Unused functions are discarded (ie, not added to the executable) + + - Aggressive function inlining removes unnecessary stack frames + + - Single-arm switch statements are replaced by a single conditional + branch + +And so on. + + +Pragmas +------- + +Pragma directives specify exceptions to the normal generation of +encoding and decoding functions. Currently one directive is +implemented: "public". + +Pragma exclude +------ ------- + + pragma exclude <RPC procedure> ; + +In some cases, a procedure encoder or decoder function might need +special processing that cannot be automatically generated. The +automatically-generated functions might conflict or interfere with +the hand-rolled function. To avoid editing the generated source code +by hand, a pragma can specify that the procedure's encoder and +decoder functions are not included in the generated header and +source. + +For example: + + pragma exclude NFSPROC3_READDIRPLUS; + +Excludes the decoder function for the READDIRPLUS argument and the +encoder function for the READDIRPLUS result. + +Note that because data item encoder and decoder functions are +defined "static __maybe_unused", subsequent compilation +automatically excludes data item encoder and decoder functions that +are used only by excluded procedure. + +Pragma header +------ ------ + + pragma header <string> ; + +Provide a name to use for the header file. For example: + + pragma header nlm4; + +Adds + + #include "nlm4xdr_gen.h" + +to the generated source file. + +Pragma public +------ ------ + + pragma public <XDR data item> ; + +Normally XDR encoder and decoder functions are "static". In case an +implementer wants to call these functions from other source code, +s/he can add a public pragma in the input .x file to indicate a set +of functions that should get a prototype in the generated header, +and the function definitions will not be declared static. + +For example: + + pragma public nfsstat3; + +Adds these prototypes in the generated header: + + bool xdrgen_decode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 *ptr); + bool xdrgen_encode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 value); + +And, in the generated source code, both of these functions appear +without the "static __maybe_unused" modifiers. + + +Future Work +----------- + +Finish implementing XDR pointer and list types. + +Generate client-side procedure functions + +Expand the README into a user guide similar to rpcgen(1) + +Add more pragma directives: + + * @pages -- use xdr_read/write_pages() for the specified opaque + field + * @skip -- do not decode, but rather skip, the specified argument + field + +Enable something like a #include to dynamically insert the content +of other specification files + +Properly support line-by-line pass-through via the "%" decorator + +Build a unit test suite for verifying translation of XDR language +into compilable code + +Add a command-line option to insert trace_printk call sites in the +generated source code, for improved (temporary) observability + +Generate kernel Rust code as well as C code |