As a newcomer to the BPF ecosystem I was confused by Pseudo-C being the
actual assembly. And while it's obvious now that the w and r forms represent
32-bit and 64-bit regs respectively, it's better to call this out in the
documentation explicitly and make it more newbie-proof.

Signed-off-by: Vineet Gupta
---
 .../bpf/standardization/instruction-set.rst      | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
index 39c74611752b..fd688c5d3f04 100644
--- a/Documentation/bpf/standardization/instruction-set.rst
+++ b/Documentation/bpf/standardization/instruction-set.rst
@@ -315,13 +315,21 @@ For arithmetic and jump instructions (``ALU``, ``ALU64``, ``JMP`` and
 Arithmetic instructions
 -----------------------
 
-``ALU`` uses 32-bit wide operands while ``ALU64`` uses 64-bit wide operands for
-otherwise identical operations. ``ALU64`` instructions belong to the
-base64 conformance group unless noted otherwise.
-The 'code' field encodes the operation as below, where 'src' refers to the
+``ALU`` uses 32-bit wide operands ('w' registers in assembly) while
+``ALU64`` uses 64-bit wide operands ('r' registers) for otherwise
+identical operations. ``ALU64`` instructions belong to the base64
+conformance group unless noted otherwise.
+The 'code' field encodes the operation as below, where 'src' refers to the
 source operand and 'dst' refers to the value of the destination register.
 
+Note: BPF ISA is unique as it uses "Pseudo-C" notation for the assembly
+      instructions. In the table below, the column name actually specifies
+      the encodings. Assembly instructions (as generated by compilers) are
+      specified in the description column for some cases. Description of
+      ``?DIV``, ``?MOD`` includes additional logic part of semantics not
+      actual assembly.
+
 .. table:: Arithmetic instructions
 
   ===== ===== ======= ===================================================================================
-- 
2.53.0

- Add some assembly (pseudo-C) snippets.
- Rearrange: MOV content comes before MOVSX.
- MOVSX content itself rearranged: canonical sign extension variant for
  {8,16,32} -> 64 moved ahead of the special variant which only sign extends
  to 32 and zeroes out the upper bits.
- Remove the hyphen '-' in "sign-extension" to make grep hit all instances
  with one pattern.

Signed-off-by: Vineet Gupta
---
 .../bpf/standardization/instruction-set.rst      | 33 +++++++++++++------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
index fd688c5d3f04..ac61d3be7af2 100644
--- a/Documentation/bpf/standardization/instruction-set.rst
+++ b/Documentation/bpf/standardization/instruction-set.rst
@@ -414,25 +414,38 @@ etc. This specification requires that signed modulo MUST use truncated division
 
    a % n = a - n * trunc(a / n)
 
-The ``MOVSX`` instruction does a move operation with sign extension.
-``{MOVSX, X, ALU}`` :term:`sign extends` 8-bit and 16-bit operands into
-32-bit operands, and zeroes the remaining upper 32 bits.
-``{MOVSX, X, ALU64}`` :term:`sign extends` 8-bit, 16-bit, and 32-bit
-operands into 64-bit operands. Unlike other arithmetic instructions,
-``MOVSX`` is only defined for register source operands (``X``).
+For move operations, the ``MOV`` instruction has a few different forms.
+
+``{MOV, X, ALU64}`` means::
+
+  dst = src (e.g. r1 = r2)
 
 ``{MOV, K, ALU64}`` means::
 
   dst = (s64)imm
 
-``{MOV, X, ALU}`` means::
+e.g. r1 = -4; r5 = 9282009
+
+``{MOV, X, ALU}`` has zero extension semantics (upper 32 bits are zeroed)::
 
   dst = (u32)src
 
+e.g. w5 = w9
+
+The ``MOVSX`` instruction does a move operation with sign extension.
+``{MOVSX, X, ALU64}`` :term:`sign extends` 8-bit, 16-bit, and 32-bit
+operands into 64-bit operands.
+
+The ``{MOVSX, X, ALU}`` form has slightly different semantics: it
+:term:`sign extends` 8-bit and 16-bit operands into
+32-bit operands, and zeroes the remaining upper 32 bits (similar to ``MOV``).
+
 ``{MOVSX, X, ALU}`` with 'offset' 8 means::
 
   dst = (u32)(s32)(s8)src
 
+Unlike other arithmetic instructions,
+``MOVSX`` is only defined for register source operands (``X``).
 The ``NEG`` instruction is only defined when the source bit is clear
 (``K``).
 
@@ -605,7 +618,7 @@ For load and store instructions (``LD``, ``LDX``, ``ST``, and ``STX``), the
   ABS            1      legacy BPF packet access (absolute)   `Legacy BPF Packet access instructions`_
   IND            2      legacy BPF packet access (indirect)   `Legacy BPF Packet access instructions`_
   MEM            3      regular load and store operations     `Regular load and store operations`_
-  MEMSX          4      sign-extension load operations        `Sign-extension load operations`_
+  MEMSX          4      sign extension load operations        `Sign extension load operations`_
   ATOMIC         6      atomic operations                     `Atomic operations`_
   =============  =====  ====================================  =============
 
@@ -649,10 +662,10 @@ instructions that transfer data between a register and memory.
 Where '' is one of: ``B``, ``H``, ``W``, or ``DW``, and
 'unsigned size' is one of: u8, u16, u32, or u64.
 
-Sign-extension load operations
+Sign extension load operations
 ------------------------------
 
-The ``MEMSX`` mode modifier is used to encode :term:`sign-extension` load
+The ``MEMSX`` mode modifier is used to encode :term:`sign extension` load
 instructions that transfer data between a register and memory.
 
 ``{MEMSX, , LDX}`` means::
-- 
2.53.0
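
As a cross-check on the MOV/MOVSX wording in the second patch, the semantics
it describes can be modelled in plain C. This is only a sketch under stated
assumptions: the bpf_* helper names are hypothetical (they are not kernel or
libbpf APIs), the 8-bit ALU64 case is assumed to be selected the same way the
text describes for the ALU form with 'offset' 8, and the narrowing casts rely
on the two's-complement wrap that GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: they model the documented semantics only. */

/* {MOV, K, ALU64}: dst = (s64)imm, e.g. r1 = -4 */
static uint64_t bpf_mov64_imm(int32_t imm)
{
	return (uint64_t)(int64_t)imm;
}

/* {MOV, X, ALU}: dst = (u32)src, upper 32 bits zeroed, e.g. w5 = w9 */
static uint64_t bpf_mov32(uint64_t src)
{
	return (uint32_t)src;
}

/* {MOVSX, X, ALU64}, 8-bit case (assumed): sign extend to 64 bits */
static uint64_t bpf_movsx64_8(uint64_t src)
{
	return (uint64_t)(int64_t)(int8_t)src;
}

/* {MOVSX, X, ALU} with 'offset' 8: dst = (u32)(s32)(s8)src */
static uint64_t bpf_movsx32_8(uint64_t src)
{
	return (uint32_t)(int32_t)(int8_t)src;
}

/* Small self-check contrasting the two MOVSX forms on the same input. */
static void bpf_mov_selfcheck(void)
{
	assert(bpf_mov64_imm(-4) == 0xfffffffffffffffcULL);
	assert(bpf_mov32(0xffffffff12345678ULL) == 0x12345678ULL);
	assert(bpf_movsx64_8(0x80) == 0xffffffffffffff80ULL);
	assert(bpf_movsx32_8(0x80) == 0x00000000ffffff80ULL);
}
```

Feeding 0x80 through both MOVSX helpers shows the difference the patch calls
out: the ALU64 form sign extends all the way to 64 bits, while the ALU form
sign extends to 32 bits and leaves the upper 32 bits zero, like ``MOV``.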