NAME
CPU::x86_64::InstructionWriter - Assemble x86-64 instructions using a pure-perl API
VERSION
version 0.001
SYNOPSIS
# POSIX::exit(42);
my $machine_code= CPU::x86_64::InstructionWriter->new
->mov( 'RAX', 60 )
->mov( 'RDI', 42 )
->syscall()
->bytes;
# if (x == 1) { ++x } else { ++y }
my $machine_code= CPU::x86_64::InstructionWriter->new
->cmp( 'RAX', 1 )
->jne('else') # jump to not-yet-defined label named 'else'
->inc( 'RAX' )
->jmp('end') # jump to another not-yet-defined label
->label('else') # resolve previous jump to this address
->inc( 'RCX' )
->label('end') # resolve second jump to this address
->bytes;
DESCRIPTION
This module is at an early stage of development and the API is not finalized.
The purpose of this module is to relatively efficiently assemble instructions for the x86-64 without generating and re-parsing assembly language, or shelling out to an external tool. All instructions are assumed to be for the 64-bit mode of the processor. Functionality for real mode or segmented 16-bit mode could be added by a yet-to-be-written ::x86 module.
This module consists of a bunch of chainable methods which build a string of machine code as you call them. It supports lazy-resolved jump labels, and lazy-bound constants which can be assigned a value after the instructions have been assembled.
Note: This module currently requires a perl with 64-bit integers and pack('Q') support.
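A quick way to check whether a given perl meets this requirement is sketched below. It uses only the core Config module and an eval around pack('Q'); the variable names are illustrative and not part of this module.

```perl
use strict;
use warnings;
use Config;

# True when this perl stores integers in at least 8 bytes.
my $has_64bit_ints = $Config{ivsize} >= 8 ? 1 : 0;

# pack('Q') throws on perls built without 64-bit support, so probe it in an eval.
my $has_pack_q = eval { pack('Q', 1); 1 } ? 1 : 0;

print "64-bit integers: ", ($has_64bit_ints ? "yes" : "no"), "\n";
print "pack('Q'):       ", ($has_pack_q     ? "yes" : "no"), "\n";
```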
NOTATIONS
The method names of this class loosely match the NASM notation, but with the number of data bits and the types of the arguments appended to the opcode name.
MOV EAX, [EBX]
$w->mov32_reg_mem('eax', ['ebx']);
# or, short form
use CPU::x86_64::InstructionWriter ':registers';
$w->mov(eax,[ebx]);
Using a specific method like 'mov32_reg_mem' runs faster than the generic method 'mov', and removes ambiguity since your code generator probably already knows what operation it wants. Also it removes the need for the "qword" attributes that NASM sometimes needs. However, if you want you can use the generic method for an op.
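To illustrate the naming scheme, here is a minimal sketch of how a specific method name could be derived from the arguments of a generic call. The helpers arg_kind and specific_method_name and the small register table are hypothetical and are not the module's actual dispatch code.

```perl
use strict;
use warnings;

# Hypothetical register-size table (only a few registers, for illustration).
my %reg_bits = (
    rax => 64, rbx => 64, rcx => 64, rdx => 64,
    eax => 32, ebx => 32, ecx => 32, edx => 32,
    ax  => 16, bx  => 16, cx  => 16, dx  => 16,
    al  => 8,  bl  => 8,  cl  => 8,  dl  => 8,
);

# Classify an argument as a memory location, register, or immediate.
sub arg_kind {
    my ($arg) = @_;
    return 'mem' if ref $arg eq 'ARRAY';
    return 'reg' if exists $reg_bits{ lc $arg };
    return 'imm';
}

# Build a name such as "mov32_reg_mem" from the opcode and its arguments.
sub specific_method_name {
    my ($op, $dst, $src, $bits) = @_;
    my @kinds = map { arg_kind($_) } $dst, $src;
    if (!defined $bits) {
        my ($reg) = grep { arg_kind($_) eq 'reg' } $dst, $src;
        die "operand size is ambiguous without a register or \$bits\n"
            unless defined $reg;
        $bits = $reg_bits{ lc $reg };
    }
    return join '_', "$op$bits", @kinds;
}

print specific_method_name('mov', 'eax', ['ebx']), "\n";
print specific_method_name('cmp', 'eax', 'ebx'),   "\n";
print specific_method_name('mov', ['rbx'], 5, 64), "\n";
```

The lookup on every call is the kind of overhead that calling the specific method directly avoids.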
There are often entirely new names given to an opcode (for the somewhat obscure ones) but the official Intel/AMD name is provided as an alias.
CMP EAX, EBX
JNO label ; quick, what does JNO mean?
$w->cmp32_reg_reg('eax','ebx')->jmp_unless_overflow($label);
# or:
$w->cmp(eax,ebx)->jno("mylabel");
MEMORY LOCATIONS
Most instructions in the x86 set allow for one argument to be a memory location, composed of
- A base register
- plus a constant displacement (usually limited to 32-bit)
- plus an index register times a scale of 1, 2, 4, or 8
[ $base, $displacement, $index, $scale ]
Leave a slot in the array undef to skip it (but obviously one of them must be set). You may also pass a shorter array to imply that the remaining items are undef.
Examples:
['rdx'] # address RDX
['rbx', -20000] # address RBX-20000
[undef, 0x7FFFFFFF] # address 0x7FFFFFFF
[undef, undef, 'ecx', 8] # address ECX*8
NASM supports scales like [EAX*5] by silently converting that to [EAX+EAX*4], but this module does not support that via the scale field. (it would just slow things down for a feature nobody uses)
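As a rough illustration of how the four slots combine, the sketch below renders a memory-location array as NASM-style text. describe_mem is a hypothetical helper written for this example; it is not provided by the module.

```perl
use strict;
use warnings;

# Hypothetical helper: turn [ $base, $displacement, $index, $scale ]
# into a readable address expression.
sub describe_mem {
    my ($base, $disp, $index, $scale) = @{ $_[0] };
    my @terms;
    push @terms, uc $base                            if defined $base;
    push @terms, uc($index) . '*' . ($scale // 1)    if defined $index;
    my $text = join '+', @terms;
    if (defined $disp) {
        if    (!length $text) { $text  = sprintf '0x%X', $disp }
        elsif ($disp < 0)     { $text .= '-' . abs($disp) }
        else                  { $text .= '+' . $disp }
    }
    die "memory location needs at least one component\n" unless length $text;
    return "[$text]";
}

print describe_mem($_), "\n" for (
    ['rdx'],
    ['rbx', -20000],
    [undef, 0x7FFFFFFF],
    [undef, undef, 'ecx', 8],
);
```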
ATTRIBUTES
start_address
You might or might not need to set this. Some instructions care about what address they live at for things like RIP-relative addressing. The default value is an object of class "unknown". Things that depend on it will also be represented by "unknown" until the start_address has been given a value. If you try to resolve them numerically before start_address is set, you get an exception.
labels
This is a set of all labels currently relevant to this writer, indexed by name (so names must be unique). You probably don't need to access this. See "get_label" and "label".
METHODS
get_label
my $label= $writer->get_label($name); # label-by-name, created on demand
my $label= $writer->get_label(); # new anonymous label
Return a label object for the given name, or if no name is given, return an anonymous label.
The label objects returned can be assigned a location within the instruction stream using "label" and used as the target for JMP and JMP-like instructions. A label can also be used as a constant once all variable-length instructions have been "resolve"d and once "start_address" is defined.
label
->label($label_ref) # bind label object to current position
->label(my $new_label) # like above, but create anonymous label object and assign to $new_label
->label($label_name) # like above, but create/lookup label object by name
Bind a named label to the current position in the instruction buffer. You can also pass a label reference from "get_label", or an undef variable which will be assigned a label.
If the current position follows instructions of unknown length, the label will be processed as an unknown, and shift automatically as the instructions are resolved.
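The general idea of resolving a jump to a label that is defined later can be sketched with a two-pass loop. This is a simplified stand-in that always uses the 5-byte JMP rel32 form; the item structure and variable names are assumptions for the example and do not reflect the module's internals.

```perl
use strict;
use warnings;

# Hypothetical mini instruction list: raw bytes, a label definition,
# or a 5-byte "JMP rel32" whose target label may be defined later.
my @items = (
    { jmp   => 'end' },             # forward reference
    { bytes => "\x90\x90\x90" },    # three NOPs to jump over
    { label => 'end' },
    { bytes => "\xC3" },            # RET
);

# Pass 1: walk the items to learn the offset of every label.
my %label_offset;
my $pos = 0;
for my $item (@items) {
    if    (exists $item->{label}) { $label_offset{ $item->{label} } = $pos }
    elsif (exists $item->{jmp})   { $pos += 5 }
    else                          { $pos += length $item->{bytes} }
}

# Pass 2: emit bytes. The displacement is relative to the start of the
# instruction that follows the jump.
my $code = '';
for my $item (@items) {
    if (exists $item->{jmp}) {
        my $target = $label_offset{ $item->{jmp} }
            // die "label '$item->{jmp}' was never defined\n";
        my $next = length($code) + 5;
        $code .= pack 'Cl<', 0xE9, $target - $next;
    }
    elsif (exists $item->{bytes}) {
        $code .= $item->{bytes};
    }
}

print join(' ', map { sprintf '%02X', $_ } unpack 'C*', $code), "\n";
```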
bytes
Return the assembled instructions as a string of bytes. This will fail if any of the labels were left un-marked or if any expressions can't be evaluated.
DATA DECLARATION
This class assembles instructions, but sometimes you want to mix in data, and label the data. These methods append data, optionally aligned.
data
Append a string of literal bytes to the instruction stream.
data_i8, data_i16, data_i32, data_i64
Pack an integer into some number of bits and append it.
data_f32, data_f64
Pack a floating point number into the given bit-length (float or double) and append it.
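For reference, the little-endian layouts that x86-64 expects for these widths can be produced with Perl's built-in pack. This sketch only shows the byte layouts; that the data_* methods emit exactly these bytes is an assumption, not something stated above.

```perl
use strict;
use warnings;

# Format a byte string as space-separated hex.
sub hex_bytes { join ' ', map { sprintf '%02X', $_ } unpack 'C*', $_[0] }

my $i8  = pack 'c',  -1;        # 1 byte
my $i16 = pack 's<', 0x1234;    # 2 bytes, low byte first
my $i32 = pack 'l<', -2;        # 4 bytes
my $i64 = pack 'q<', 1;         # 8 bytes (needs a 64-bit perl)
my $f32 = pack 'f<', 1.0;       # single-precision float
my $f64 = pack 'd<', 1.0;       # double-precision float

printf "%-4s %s\n", @$_ for (
    [ i8  => hex_bytes($i8)  ],
    [ i16 => hex_bytes($i16) ],
    [ i32 => hex_bytes($i32) ],
    [ i64 => hex_bytes($i64) ],
    [ f32 => hex_bytes($f32) ],
    [ f64 => hex_bytes($f64) ],
);
```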
align, align16, align32, align64, align128
Append zero or more bytes so that the next instruction is aligned in memory. By default, the fill-byte will be a NO-OP (0x90). You can override it with your choice.
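The amount of padding needed for a given alignment is a small piece of arithmetic, sketched here. padding_for is a hypothetical helper, and the alignment is given in bytes for this example.

```perl
use strict;
use warnings;

# Number of fill bytes needed to bring $length up to a multiple of $alignment.
# Perl's % takes the sign of the right operand, so the result is never negative.
sub padding_for {
    my ($length, $alignment) = @_;
    return (-$length) % $alignment;
}

my $code = "\x0F\x05";                          # SYSCALL, 2 bytes
my $pad  = padding_for(length($code), 8);
$code   .= "\x90" x $pad;                       # fill with NOPs

printf "padded with %d bytes to a length of %d\n", $pad, length $code;
```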
INSTRUCTIONS
The following methods append an instruction to the buffer, and return $self so you can continue calling instructions in a chain.
NOP, PAUSE
Insert one or more no-op instructions.
nop(), nop( $n )
-
If called without an argument, insert one no-op. Else insert $n no-ops.
pause(), pause( $n )
-
Like NOP, but hints to the processor that the program is in a spin-loop so it has the opportunity to reduce power consumption. This is a 2-byte instruction.
CALL
call_label( $label )
-
Call to subroutine at named label, relative to current RIP. This method takes a label and calculates a call_rel( $ofs ) for you.
call_rel( $offset )
-
Call to subroutine at signed 32-bit offset from current RIP.
call_abs_reg( $reg )
-
Call to subroutine at absolute address stored in 64-bit register.
call_abs_mem( \@mem )
-
Call to subroutine at absolute address stored at "memory location".
RET
->ret
->ret($pop_bytes) # 16-bit number of bytes to discard from stack
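The two forms correspond to the standard near-return encodings: a single byte C3, or C2 followed by a 16-bit little-endian count. The sketch below builds those bytes directly; ret_bytes is a hypothetical helper, not a method of this class.

```perl
use strict;
use warnings;

# Encode a near RET, optionally discarding $pop_bytes bytes from the stack.
sub ret_bytes {
    my ($pop_bytes) = @_;
    return "\xC3" unless defined $pop_bytes;
    die "pop count must fit in 16 bits\n"
        if $pop_bytes < 0 || $pop_bytes > 0xFFFF;
    return pack 'Cv', 0xC2, $pop_bytes;
}

print join(' ', map { sprintf '%02X', $_ } unpack 'C*', ret_bytes()),   "\n";
print join(' ', map { sprintf '%02X', $_ } unpack 'C*', ret_bytes(16)), "\n";
```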
JMP
All jump instructions are relative, and take either a numeric offset (from the start of the next instruction) or a label, except the jmp_abs_reg instruction which takes a register containing the target address, and the jmp_abs_mem which reads a memory address for the address to jump to.
If you pass an undefined variable as a label it will be auto-populated with a label object. Otherwise the label should be a string (label name) or label object obtained from "get_label".
jmp($label)
-
Unconditional jump to label (or 32-bit offset constant).
jmp_abs_reg($reg)
-
Jump to the absolute address contained in a register.
jmp_abs_mem(\@mem)
-
Jump to the absolute address read from a "memory location"
jmp_if_eq, je, jz
jmp_if_ne, jne, jnz
-
Jump to label if zero flag is/isn't set after CMP instruction
jmp_if_unsigned_lt, jb, jmp_if_carry, jc
jmp_if_unsigned_gt, ja
jmp_if_unsigned_le, jbe
jmp_if_unsigned_ge, jae, jmp_unless_carry, jnc
-
Jump to label if unsigned less-than / greater-than / less-or-equal / greater-or-equal
jmp_if_signed_lt, jl
jmp_if_signed_gt, jg
jmp_if_signed_le, jle
jmp_if_signed_ge, jge
-
Jump to label if signed less-than / greater-than / less-or-equal / greater-or-equal
jmp_if_sign, js
jmp_unless_sign, jns
-
Jump to label if 'sign' flag is/isn't set after CMP instruction
jmp_if_overflow, jo
jmp_unless_overflow, jno
-
Jump to label if overflow flag is/isn't set after CMP instruction
jmp_if_parity_even, jpe, jp
jmp_if_parity_odd, jpo, jnp
-
Jump to label if 'parity' flag is/isn't set after CMP instruction
jmp_cx_zero, jrcxz
-
Short-jump to label if RCX register is zero
loop
-
Decrement RCX and short-jump to label if RCX register is nonzero (decrement of RCX does not change rFLAGS)
loopz, loope
-
Decrement RCX and short-jump to label if RCX register is nonzero and zero flag (ZF) is set (decrement of RCX does not change rFLAGS)
loopnz, loopne
-
Decrement RCX and short-jump to label if RCX register is nonzero and zero flag (ZF) is not set (decrement of RCX does not change rFLAGS)
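The conditional jumps share one opcode pattern in the architecture manuals: a 4-bit condition code is added to 0x70 for the short (8-bit offset) form, or to 0x0F 0x80 for the near (32-bit offset) form. The sketch below encodes both forms from the Intel mnemonics; the helper names are hypothetical, and whether the module picks short or near forms the same way is not assumed here.

```perl
use strict;
use warnings;

# Condition codes as listed in the architecture manuals.
my %condition_code = (
    jo => 0x0, jno => 0x1, jb => 0x2, jae => 0x3,
    je => 0x4, jne => 0x5, jbe => 0x6, ja => 0x7,
    js => 0x8, jns => 0x9, jp => 0xA, jnp => 0xB,
    jl => 0xC, jge => 0xD, jle => 0xE, jg => 0xF,
);

# Short form: one opcode byte plus a signed 8-bit offset.
sub jcc_short {
    my ($name, $rel8) = @_;
    return pack 'Cc', 0x70 + $condition_code{$name}, $rel8;
}

# Near form: two opcode bytes plus a signed 32-bit offset.
sub jcc_near {
    my ($name, $rel32) = @_;
    return pack 'CCl<', 0x0F, 0x80 + $condition_code{$name}, $rel32;
}

sub hex_bytes { join ' ', map { sprintf '%02X', $_ } unpack 'C*', $_[0] }

print hex_bytes(jcc_short('je',  5)),     "\n";
print hex_bytes(jcc_short('jno', -2)),    "\n";
print hex_bytes(jcc_near('jne',  0x100)), "\n";
```

Flipping the lowest bit of a condition code gives the opposite condition, which is why each is/isn't pair above differs by one.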
MOV
mov($dest, $src, $bits)
-
Generic top-level instruction method that dispatches to more specific versions of mov based on the arguments you gave it. The third argument is optional if one of the other arguments is a register.
mov64_reg_reg($dest_reg, $src_reg)
-
Copy second register to first register. Copies full 64-bit value.
mov##_mem_reg($mem, $reg)
-
Store ##-bit value in register to a "memory location". If the memory location consists of a single displacement greater than 32 bits, the register must be the appropriate size accumulator (RAX, EAX, AX, or AL)
mov##_reg_mem($reg, $mem)
-
Load ##-bit value at "memory location" into register. The Displacement portion of the memory location must normally be 32-bit, but as a special case you can load a full 64-bit displacement (with no register offset) into the Accumulator register of that size (RAX, EAX, AX, or AL).
$asm->mov8_reg_mem ( 'al', [ undef, 0xFF00FF00FF00FF00 ]);
$asm->mov64_reg_mem('rax', [ undef, 0xFF00FF00FF00FF00 ]);
mov64_reg_imm($dest_reg, $constant)
-
Load a constant value into a 64-bit register. Constant is sign-extended to 64-bits. Constant may be an expression.
mov##_mem_imm($mem, $constant)
-
Store a constant value into a ##-bit memory location. For mov64, constant is 32-bit sign-extended to 64-bits. Constant may be an expression.
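Because the 64-bit store takes a 32-bit immediate that the processor sign-extends, only constants in the signed 32-bit range can be stored this way. The sketch below checks that range and shows the resulting 8 bytes; both helper names are hypothetical.

```perl
use strict;
use warnings;

# True when $value survives a round trip through a signed 32-bit immediate.
sub fits_simm32 {
    my ($value) = @_;
    return ($value >= -0x80000000 && $value <= 0x7FFFFFFF) ? 1 : 0;
}

# The 8 bytes that end up in memory for a sign-extended 32-bit immediate.
sub sign_extend_32_to_64 {
    my ($imm32) = @_;
    my ($signed) = unpack 'l<', pack('l<', $imm32);
    return pack 'q<', $signed;
}

sub hex_bytes { join ' ', map { sprintf '%02X', $_ } unpack 'C*', $_[0] }

print fits_simm32(-1),         "\n";
print fits_simm32(0xFFFFFFFF), "\n";
print hex_bytes(sign_extend_32_to_64(-1)),         "\n";
print hex_bytes(sign_extend_32_to_64(0x7FFFFFFF)), "\n";
```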
CMOV
TODO...
LEA
lea($reg, $src, $bits)
-
Dispatch to a variant of LEA based on argument types.
lea16_reg_mem($reg16, \@mem)
lea32_reg_mem($reg32, \@mem)
lea64_reg_mem($reg64, \@mem)
lea16_reg_reg($reg16, $reg64)
lea32_reg_reg($reg32, $reg64)
lea64_reg_reg($reg64, $reg64)
Load the effective address described by a "memory location" into the register. No memory is read; LEA only performs the address arithmetic (base + index * scale + displacement), which also makes it useful as a general-purpose add-and-shift instruction.
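The value LEA places in the destination register is just the sum of the memory-location components. A minimal sketch of that arithmetic follows, with register contents supplied in a plain hash; effective_address is a hypothetical helper for illustration.

```perl
use strict;
use warnings;

# Compute base + displacement + index * scale from a memory-location array,
# reading register values from the hash reference $regs.
sub effective_address {
    my ($regs, $mem) = @_;
    my ($base, $disp, $index, $scale) = @$mem;
    my $addr = 0;
    $addr += $regs->{$base}                    if defined $base;
    $addr += $disp                             if defined $disp;
    $addr += $regs->{$index} * ($scale // 1)   if defined $index;
    return $addr;
}

my %regs = (rbx => 0x1000, rcx => 3);
printf "0x%X\n", effective_address(\%regs, ['rbx', 8, 'rcx', 4]);
```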
ADD, ADC
The add## variants are the plain ADD instruction, for each bit width. The addcarry## variants are the ADC instruction that also adds the carry flag, useful for multi-word addition.
add($dst, $src, $bits)
add##_reg_reg($dest, $src)
add##_reg_mem($reg, \@mem)
add##_mem_reg(\@mem, $reg)
add##_reg_imm($reg, $const)
add##_mem_imm(\@mem, $const)
-
Returns $self, for chaining.
addcarry($dst, $src, $bits), adc($dst, $src, $bits)
addcarry##_reg(reg64, reg64)
addcarry##_mem(reg64, base_reg64, displacement, index_reg64, scale)
addcarry##_to_mem(reg64, base_reg64, displacement, index_reg64, scale)
addcarry##_const(reg64, const)
addcarry##_const_to_mem(const, base_reg64, displacement, index_reg64, scale)
Returns $self, for chaining.
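The role of the carry flag in multi-word addition can be modeled in plain Perl: the lowest words are added first, and each later addition includes the carry out of the previous one. The sketch uses 32-bit words so the intermediate sums fit in a 64-bit perl integer; add_multiword is a hypothetical helper.

```perl
use strict;
use warnings;

# Add two equal-length numbers stored as 32-bit words, least significant first.
# Returns the sum words and the final carry, mirroring ADD followed by ADC.
sub add_multiword {
    my ($x, $y) = @_;
    my $carry = 0;
    my @sum;
    for my $i (0 .. $#$x) {
        my $total = $x->[$i] + $y->[$i] + $carry;
        push @sum, $total & 0xFFFFFFFF;
        $carry = $total >> 32;
    }
    return (\@sum, $carry);
}

my ($sum, $carry) = add_multiword([0xFFFFFFFF, 1], [1, 0]);
printf "sum words: %s, carry out: %d\n",
    join(', ', map { sprintf '0x%08X', $_ } @$sum), $carry;
```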
sub
AND
and($dst, $src, $bits)
and##_reg_reg($dest, $src)
and##_reg_mem($reg, \@mem)
and##_mem_reg(\@mem, $reg)
and##_reg_imm($reg, $const)
and##_mem_imm(\@mem, $const)
OR
or($dst, $src, $bits)
or##_reg(reg64, reg64)
or##_mem(reg64, base_reg64, displacement, index_reg64, scale)
or##_to_mem(reg64, base_reg64, displacement, index_reg64, scale)
or##_const(reg64, const)
or##_const_to_mem(const, base_reg64, displacement, index_reg64, scale)
XOR
xor($dst, $src, $bits)
xor##_reg(reg64, reg64)
xor##_mem(reg64, base_reg64, displacement, index_reg64, scale)
xor##_to_mem(reg64, base_reg64, displacement, index_reg64, scale)
xor##_const(reg64, const)
xor##_const_to_mem(const, base_reg64, displacement, index_reg64, scale)
SHL
Shift left by a constant or the CL register. The shift is at most 63 bits for 64-bit register, or 31 bits otherwise.
shl($dst, $src, $bits)
shl##_reg_imm( $reg, $const )
shl##_mem_imm( \@mem, $const )
shl##_reg_cl( $reg )
shl##_mem_cl( \@mem )
SHR
Shift right by a constant or the CL register. The shift is at most 63 bits for 64-bit register, or 31 bits otherwise.
shr($dst, $src, $bits)
shr##_reg_imm( $reg, $const )
shr##_mem_imm( \@mem, $const )
shr##_reg_cl( $reg, 'cl' // undef )
shr##_mem_cl( \@mem, 'cl' // undef )
SAR
Shift "arithmetic" right by a constant or the CL register, and sign-extend the left-most bits. The shift is at most 63 bits for 64-bit register, or 31 bits otherwise.
sar($dst, $src, $bits)
sar##_reg_imm( $reg, $const )
sar##_mem_imm( \@mem, $const )
sar##_reg_cl( $reg, 'cl' // undef )
sar##_mem_cl( \@mem, 'cl' // undef )
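The difference between the logical and arithmetic right shifts, and the way the processor masks the shift count, can be modeled as follows for 32-bit operands. shr32 and sar32 are hypothetical helpers that assume a 64-bit perl.

```perl
use strict;
use warnings;

# Logical shift right: vacated high bits become zero.
sub shr32 {
    my ($value, $count) = @_;
    $count &= 0x1F;                       # count is masked to 5 bits
    return ($value & 0xFFFFFFFF) >> $count;
}

# Arithmetic shift right: vacated high bits copy the sign bit.
sub sar32 {
    my ($value, $count) = @_;
    $count &= 0x1F;
    my $result = ($value & 0xFFFFFFFF) >> $count;
    if ($value & 0x80000000) {
        $result |= (0xFFFFFFFF << (32 - $count)) & 0xFFFFFFFF;
    }
    return $result;
}

printf "shr: 0x%08X\n", shr32(0x80000000, 4);
printf "sar: 0x%08X\n", sar32(0x80000000, 4);
printf "shr by 33 acts like shr by 1: 0x%08X\n", shr32(2, 33);
```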
BSWAP
Swap byte order on 32 or 64 bits.
- bswap64
- bswap32
- bswap16 (This is actually the XCHG instruction)
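The effect of a byte swap on a value can be reproduced in Perl by packing in one byte order and unpacking in the other. The helper names below are hypothetical, and the 64-bit variant needs a perl with 'Q' support.

```perl
use strict;
use warnings;

# Reverse the four bytes of a 32-bit value.
sub swap_bytes32 { my ($v) = @_; return unpack 'N', pack('V', $v) }

# Reverse the eight bytes of a 64-bit value.
sub swap_bytes64 { my ($v) = @_; return unpack 'Q>', pack('Q<', $v) }

printf "0x%08X\n", swap_bytes32(0x12345678);

# Build the 64-bit test value from two halves to avoid a large hex literal.
my $wide = (0x01020304 << 32) | 0x05060708;
printf "0x%016X\n", swap_bytes64($wide);
```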
CMP
Like SUB, but don't modify any arguments, just update RFLAGS.
cmp($dst, $src, $bits)
cmp##_reg_reg($dest, $src)
cmp##_reg_mem($reg, \@mem)
-
Subtract mem (second arg) from reg (first arg)
cmp##_mem_reg(\@mem, $reg)
-
Subtract reg (second arg) from mem (first arg)
cmp##_reg_imm($reg, $const)
-
Subtract const from reg
cmp##_mem_imm(\@mem, $const)
-
Subtract const from contents of mem address
TEST
Like AND, but don't modify any arguments, just update flags. Note that order of arguments does not matter, and there is no "to_mem" variant.
test($dst, $src, $bits)
test##_reg_reg($dest, $src)
test##_reg_mem($reg, \@mem)
test##_reg_imm($reg, $const)
test##_mem_imm(\@mem, $const)
DEC
INC
NOT
Flip all bits in a target register or memory location.
NEG
Replace target register or memory location with signed negation (2's complement).
DIV, IDIV
div##_reg($reg)
-
Unsigned divide of _DX:_AX by a NN-bit register. (divides AX into AL,AH for 8-bit)
div##_mem(\@mem)
-
Unsigned divide of _DX:_AX by a NN-bit memory value referenced by 64-bit registers
idiv##_reg($reg)
-
Signed divide of _DX:_AX by a NN-bit register. (divides AX into AL,AH for 8-bit)
idiv##_mem(\@mem)
-
Signed divide of _DX:_AX by a NN-bit memory value referenced by 64-bit registers
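For the 8-bit unsigned case, the division takes the 16-bit value in AX and leaves the quotient in AL and the remainder in AH, raising a divide error if the quotient does not fit. A small model of that behavior, using a hypothetical helper div8:

```perl
use strict;
use warnings;

# Model of the 8-bit unsigned DIV: returns (AL, AH) = (quotient, remainder).
sub div8 {
    my ($ax, $divisor) = @_;
    die "divide error: division by zero\n" if $divisor == 0;
    my $quotient  = int($ax / $divisor);
    my $remainder = $ax % $divisor;
    die "divide error: quotient does not fit in AL\n" if $quotient > 0xFF;
    return ($quotient, $remainder);
}

my ($al, $ah) = div8(1000, 7);
print "AL (quotient) = $al, AH (remainder) = $ah\n";
```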
MUL
- mul64_dxax_reg
- mul32_dxax_reg
- mul16_dxax_reg
- mul8_ax_reg
sign extend
Various special-purpose sign extension instructions, mostly used to set up for DIV
- sign_extend_al_ax, cbw
- sign_extend_ax_eax, cwde
- sign_extend_eax_rax, cdqe
- sign_extend_ax_dx, cwd
- sign_extend_eax_edx, cdq
- sign_extend_rax_rdx, cqo
flag modifiers
Each flag modifier takes an argument of 0 (clear), 1 (set), or -1 (invert).
PUSH
This only implements the 64-bit push instruction.
POP
ENTER
->enter( $bytes_for_vars, $nesting_level )
bytes_for_vars is an unsigned 16-bit value, and nesting_level is a value 0..31 (byte masked to 5 bits).
Both constants may be expressions.
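In the architecture manuals ENTER is encoded as opcode C8, a 16-bit little-endian frame size, and one byte for the nesting level. The sketch below builds those bytes with the 5-bit mask applied; enter_bytes is a hypothetical helper and handles plain numbers only, not expressions.

```perl
use strict;
use warnings;

# Encode ENTER with a 16-bit frame size and a nesting level masked to 5 bits.
sub enter_bytes {
    my ($bytes_for_vars, $nesting_level) = @_;
    die "frame size must fit in 16 bits\n"
        if $bytes_for_vars < 0 || $bytes_for_vars > 0xFFFF;
    return pack 'CvC', 0xC8, $bytes_for_vars, $nesting_level & 0x1F;
}

print join(' ', map { sprintf '%02X', $_ } unpack 'C*', enter_bytes(32, 0)), "\n";
```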
LEAVE
Un-do an ENTER instruction.
syscall
Syscall instruction, takes no arguments. (params are stored in pre-defined registers)
STRING INSTRUCTIONS
->xor('RAX','RAX') # Compare to 0
->mov('RCX', 42) # Count
->mov('RDI', \@memaddr) # String
->cld # Iterate to increasing address
->repne->scas8; # Iterate until [RDI] == "\0" or 42 bytes
rep
Repeat RCX times (used with "ins", "lods", "movs", "outs", "stos")
repe, repz
Repeat RCX times or until zero-flag becomes zero. (used with "cmps", "scas")
repne, repnz
Repeat RCX times or until zero-flag becomes one. (used with "cmps", "scas")
flag_direction($bool_set)
Set (1) or clear (0) the direction flag.
std
Set the direction flag (iterate to lower addresses)
cld
Clear the direction flag (iterate to higher addresses)
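How the repeat prefix, RCX, and the direction flag interact can be modeled for a byte scan. In this sketch a Perl string stands in for memory, and DF = 0 moves RDI upward while DF = 1 moves it downward; repne_scas8 here is a hypothetical simulation, not the method of this class.

```perl
use strict;
use warnings;

# Simulate REPNE SCASB: compare AL with the byte at RDI, step RDI according
# to the direction flag, decrement RCX, and stop on a match or when RCX is 0.
sub repne_scas8 {
    my ($memory, $rdi, $rcx, $al, $df) = @_;
    my $zf = 0;
    while ($rcx != 0) {
        $zf = ord(substr($memory, $rdi, 1)) == $al ? 1 : 0;
        $rdi += $df ? -1 : 1;
        $rcx--;
        last if $zf;
    }
    return ($rdi, $rcx, $zf);
}

# Search "hello\0world" for a NUL byte, scanning upward for at most 42 bytes.
my ($rdi, $rcx, $zf) = repne_scas8("hello\0world", 0, 42, 0, 0);
print "RDI=$rdi RCX=$rcx ZF=$zf\n";
print "string length: ", $rdi - 1, "\n";
```

RDI ends one byte past the match, so the string length is one less than the final RDI.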
movsNN
cmpsNN
scasNN
SYNCHRONIZATION INSTRUCTIONS
These special-purpose instructions relate to strict ordering of memory operations, cache flushing, or atomic operations useful for implementing semaphores.
compare_exchangeNN, cmpxchg
- compare_exchange64
- compare_exchange32
- compare_exchange16
- compare_exchange8
TODO
mfence, lfence, sfence
Parameterless instructions for memory access serialization. Forces memory operations before the fence to complete before memory operations after the fence. Lfence affects load operations, sfence affects store operations, and mfence affects both.
ENCODING x86_64 INSTRUCTIONS
The AMD64 Architecture Programmer's Manual is a somewhat tedious read, so here are my notes:
Typical 2-arg 64-bit instruction: REX ( AddrSize ) Opcode ModRM ( ScaleIndexBase ( Disp ) ) ( Immed )
REX: use extended registers and/or 64-bit operand sizes.
Not used for simple push/pop or handful of others
REX = 0x40 + (W:1bit R:1bit X:1bit B:1bit)
REX.W = "wide" (64-bit operand size when set)
REX.R is 4th bit of ModRM.Reg
REX.X is 4th bit of SIB.Index
REX.B is 4th bit of ModRM.R/M or of SIB.Base or of ModRM.Reg depending on goofy rules
ModRM: mode/registers flags
ModRM = (Mod:2bit Reg:3bit R/M:3bit)
ModRM.Mod indicates operands:
11b means ( Reg, R/M-reg-value )
00b means ( Reg, R/M-reg-addr ) unless second reg is SP/BP/R12/R13
01b means ( Reg, R/M-reg-addr + 8-bit disp ) unless second reg is SP/R12
10b means ( Reg, R/M-reg-addr + 32-bit disp ) unless second reg is SP/R12
When accessing mem, R/M=100b means include the SIB byte for exotic addressing options
In the 00b case, R/M=101b means use instruction pointer + 32-bit immed
SIB: optional byte for wild and crazy memory addressing; activate with ModRM.R/M = 100b
SIB = (Scale:2bit Index:3bit Base:3bit)
address is (index_register << scale) + base_register (+immed per the ModRM.Mod bits)
* unless index_register = 0100b then no register is used.
(i.e. RSP cannot be used as an index register )
* unless base_register = _101b and ModRM.mod = 00 then no register is used.
(i.e. [R{BP,13} + R?? * 2] must be written as [R{BP,13} + R?? * 2 + 0])
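The notes above can be turned into a small worked example. The sketch below assembles the REX, ModRM, and SIB bytes for two forms of 64-bit MOV (opcodes 0x89 and 0x8B from the architecture manuals). All helper names are hypothetical; this is not the module's private encoder, and it ignores the special cases listed in the notes apart from the RSP index rule.

```perl
use strict;
use warnings;

my %reg_num = (
    rax => 0, rcx => 1, rdx => 2,  rbx => 3,  rsp => 4,  rbp => 5,  rsi => 6,  rdi => 7,
    r8  => 8, r9  => 9, r10 => 10, r11 => 11, r12 => 12, r13 => 13, r14 => 14, r15 => 15,
);

# REX = 0x40 + W R X B
sub rex {
    my ($w, $r, $x, $base) = @_;
    return 0x40 | ($w << 3) | ($r << 2) | ($x << 1) | $base;
}

# ModRM = Mod:2 Reg:3 R/M:3 (the 4th bit of each register lives in REX)
sub modrm {
    my ($mod, $reg, $rm) = @_;
    return ($mod << 6) | (($reg & 7) << 3) | ($rm & 7);
}

# SIB = Scale:2 Index:3 Base:3, where Scale is log2 of 1, 2, 4, or 8
sub sib {
    my ($scale, $index, $base) = @_;
    my %scale_bits = (1 => 0, 2 => 1, 4 => 2, 8 => 3);
    return ($scale_bits{$scale} << 6) | (($index & 7) << 3) | ($base & 7);
}

# MOV r/m64, r64 (0x89) between registers: Mod = 11b, Reg = source, R/M = dest.
sub mov64_reg_reg_bytes {
    my ($dst, $src) = map { $reg_num{$_} } @_;
    return pack 'C*', rex(1, $src >> 3, 0, $dst >> 3), 0x89, modrm(3, $src, $dst);
}

# MOV r64, r/m64 (0x8B) from [base + index*scale + disp8]:
# Mod = 01b selects an 8-bit displacement, R/M = 100b selects the SIB byte.
sub mov64_reg_sib_disp8_bytes {
    my ($dst, $base, $index, $scale, $disp8) = @_;
    my ($d, $bn, $in) = map { $reg_num{$_} } $dst, $base, $index;
    die "RSP cannot be used as an index register\n" if $in == 4;
    return pack 'C4c',
        rex(1, $d >> 3, $in >> 3, $bn >> 3), 0x8B,
        modrm(1, $d, 4), sib($scale, $in, $bn), $disp8;
}

sub hex_bytes { join ' ', map { sprintf '%02X', $_ } unpack 'C*', $_[0] }

print hex_bytes(mov64_reg_reg_bytes('rax', 'rbx')), "\n";
print hex_bytes(mov64_reg_reg_bytes('r8', 'r9')), "\n";
print hex_bytes(mov64_reg_sib_disp8_bytes('rax', 'rbx', 'rcx', 4, 8)), "\n";
```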
The methods that perform the encoding are not public, but are documented in the source for anyone who wants to extend this module to handle additional instructions.
AUTHOR
Michael Conrad <mike@nrdvana.net>
COPYRIGHT AND LICENSE
This software is copyright (c) 2023 by Michael Conrad.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.