NAME

CPU::x86_64::InstructionWriter - Assemble x86-64 instructions using a pure-perl API

VERSION

version 0.000_001

SYNOPSIS

# POSIX::exit(42);
my $machine_code= CPU::x86_64::InstructionWriter->new
  ->mov64_reg_imm( 'RAX', 60 )
  ->mov64_reg_imm( 'RDI', 42 )
  ->syscall()
  ->bytes;

# if (x == 1) { ++x } else { ++y }
my $machine_code= CPU::x86_64::InstructionWriter->new
  ->cmp64_reg_imm( 'RAX', 1 )
  ->jne('else')        # jump to not-yet-defined label
  ->inc64_reg( 'RAX' )
  ->jmp('end')         # jump to another not-yet-defined label
  ->mark('else')       # resolve previous jump to this address
  ->inc64_reg( 'RCX' )
  ->mark('end')        # resolve second jump to this address
  ->bytes;

DESCRIPTION

The purpose of this module is to assemble x86-64 instructions relatively efficiently, without generating and re-parsing assembly language or shelling out to an external tool. All instructions are assumed to be for the 64-bit mode of the processor. Functionality for real mode or segmented 16-bit mode will be handled by the yet-to-be-written x86 module.

This module consists of a bunch of chainable methods which build a string of machine code as you call them. It supports lazy-resolved jump labels, and lazy-bound constants which can be assigned a value after the instructions have been assembled.

Note: This module currently requires a perl with 64-bit integers and pack('Q') support.

get_label

my $label= $writer->get_label($name); # label-by-name, created on demand
my $label= $writer->get_label();      # new anonymous label

Return a label object for the given name, or if no name is given, return an anonymous label.

The label objects returned can be assigned a location within the instruction stream using "mark" and used as the target for JMP and JMP-like instructions. A label can also be used as a constant once all variable-length instructions have been "resolve"d and once "start_address" is defined.

mark

->mark($label_ref)  # bind label object to current position
->mark($undef_var)  # like above, but create anonymous label object and assign to $var
->mark($label_name) # like above, but create/lookup label object by name

Bind a named label to the current position in the instruction buffer. You can also pass a label reference from "get_label", or an undef variable which will be assigned a label.

If the current position follows instructions of unknown length, the label will be processed as an unknown, and shift automatically as the instructions are resolved.
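
The idea of a lazily resolved label can be sketched in a few lines of plain Perl. This is only an illustration under simplifying assumptions, not this module's implementation: the helper names (emit_jmp, mark_label, resolve_all) are hypothetical, and every jump is emitted as a fixed-length JMP rel32 so that no instruction changes size during resolution.

```perl
use strict;
use warnings;

# Hypothetical mini-assembler (not this module's internals): every jump is
# emitted as JMP rel32 with a placeholder, and patched once labels are known.
my $buf = '';
my %label_pos;   # label name => byte offset within $buf
my @fixups;      # [ offset of the rel32 field, label name ]

sub emit_jmp {
    my ($name) = @_;
    $buf .= "\xE9";                          # JMP rel32 opcode
    push @fixups, [ length($buf), $name ];
    $buf .= "\0" x 4;                        # placeholder offset
}

sub mark_label { $label_pos{ $_[0] } = length $buf }

sub resolve_all {
    for my $fix (@fixups) {
        my ($pos, $name) = @$fix;
        defined $label_pos{$name}
            or die "label '$name' was never marked\n";
        # rel32 is measured from the end of the 4-byte field
        substr($buf, $pos, 4) = pack('l<', $label_pos{$name} - ($pos + 4));
    }
    return $buf;
}

emit_jmp('end');      # forward reference to a not-yet-marked label
$buf .= "\x90";       # one NOP, skipped by the jump
mark_label('end');
print unpack('H*', resolve_all()), "\n";   # prints e90100000090
```

A real assembler that also picks the short 2-byte jump form has to re-check every label position whenever an instruction shrinks, which is why labels following unresolved instructions are treated as unknowns.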

bytes

Return the assembled instructions as a string of bytes. This will fail if any of the labels were left un-marked or if any expressions can't be evaluated.

INSTRUCTIONS

The following methods append an instruction to the buffer, and return $self so you can continue calling instructions in a chain.

NOP, PAUSE

Insert one or more no-op instructions.

nop(), nop( $n )

If called without an argument, insert one no-op. Else insert $n no-ops.

pause(), pause( $n )

Like NOP, but hints to the processor that the program is in a spin-loop so it has the opportunity to reduce power consumption. This is a 2-byte instruction.

CALL

call_label( $label )

Call to subroutine at named label, relative to current RIP. This method takes a label and calculates a call_rel( $ofs ) for you.

call_rel( $offset )

Call to subroutine at signed 32-bit offset from current RIP.

call_abs_reg( $reg )

Call to subroutine at absolute address stored in 64-bit register.

call_abs_mem( \@mem )

Call to subroutine at absolute address stored at a "memory location".

RET

->ret
->ret($pop_bytes) # 16-bit number of bytes to discard from stack

JMP

All jump instructions are relative, and take either a numeric offset (from the start of the next instruction) or a label, except jmp_abs_reg, which takes a register containing the target address, and jmp_abs_mem, which reads the target address from a "memory location".

jmp

Unconditional jump to label (or 32-bit offset constant).

jmp_abs_reg($reg)

Jump to the absolute address contained in a register.

jmp_abs_mem(\@mem)

Jump to the absolute address read from a "memory location".

jmp_if_eq, je, jz
jmp_if_ne, jne, jnz

Jump to label if zero flag is/isn't set after CMP instruction

jmp_if_unsigned_lt, jb, jmp_if_carry, jc
jmp_if_unsigned_gt, ja
jmp_if_unsigned_le, jbe
jmp_if_unsigned_ge, jae, jmp_unless_carry, jnc

Jump to label if unsigned less-than / greater-than / less-or-equal / greater-or-equal

jmp_if_signed_lt, jl
jmp_if_signed_gt, jg
jmp_if_signed_le, jle
jmp_if_signed_ge, jge

Jump to label if signed less-than / greater-than / less-or-equal / greater-or-equal

jmp_if_sign, js
jmp_unless_sign, jns

Jump to label if 'sign' flag is/isn't set after CMP instruction

jmp_if_overflow, jo
jmp_unless_overflow, jno

Jump to label if overflow flag is/isn't set after CMP instruction

jmp_if_parity_even, jpe, jp
jmp_if_parity_odd, jpo, jnp

Jump to label if 'parity' flag is/isn't set after CMP instruction

jmp_cx_zero, jrcxz

Short-jump to label if RCX register is zero

loop

Decrement RCX and short-jump to label if RCX register is nonzero (decrement of RCX does not change rFLAGS)

loopz, loope

Decrement RCX and short-jump to label if RCX register is nonzero and zero flag (ZF) is set. (decrement of RCX does not change rFLAGS)

loopnz, loopne

Decrement RCX and short-jump to label if RCX register is nonzero and zero flag (ZF) is not set (decrement of RCX does not change rFLAGS)

MOV

mov64_reg_reg($dest_reg, $src_reg)

Copy second register to first register. Copies full 64-bit value.

mov##_mem_reg($mem, $reg)

Store ##-bit value in register to a "memory location".

mov##_reg_mem($reg, $mem)

Load ##-bit value at "memory location" into register.

mov64_reg_imm($dest_reg, $constant)

Load a constant value into a 64-bit register. Constant is sign-extended to 64-bits. Constant may be an expression.
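
For reference, here is a rough sketch of how such an instruction can be encoded by hand. It is an assumption-laden illustration, not the module's code: sketch_mov64_imm and the %regnum table are hypothetical names, and the module may well choose a different (shorter) encoding for the same operation.

```perl
use strict;
use warnings;

# Register numbers as used in ModRM/REX; this table is for illustration only.
my %regnum = (RAX => 0, RCX => 1, RDX => 2, RBX => 3,
              RSP => 4, RBP => 5, RSI => 6, RDI => 7);
$regnum{"R$_"} = $_ for 8 .. 15;

# Hypothetical helper, not part of the module's API.
sub sketch_mov64_imm {
    my ($reg, $imm) = @_;
    my $n = $regnum{$reg};
    defined $n or die "unknown register $reg\n";
    my $rex = 0x48 | ($n >> 3);              # REX.W, plus REX.B for R8..R15
    if ($imm >= -0x8000_0000 && $imm <= 0x7FFF_FFFF) {
        # MOV r/m64, imm32 (sign-extended): REX.W C7 /0 id
        return pack('CCCl<', $rex, 0xC7, 0xC0 | ($n & 7), $imm);
    }
    # MOV r64, imm64: REX.W B8+r io
    return pack('CCq<', $rex, 0xB8 | ($n & 7), $imm);
}

# Same instruction sequence as the exit(42) example in the SYNOPSIS
my $code = sketch_mov64_imm('RAX', 60)
         . sketch_mov64_imm('RDI', 42)
         . "\x0F\x05";                       # SYSCALL
print unpack('H*', $code), "\n";
```

The sign-extended imm32 form costs 7 bytes and the full imm64 form costs 10, so an encoder has a reason to prefer the former whenever the constant fits.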

mov##_mem_imm($mem, $constant)

Store a constant value into a ##-bit memory location. For mov64, constant is sign-extended to 64-bits. Constant may be an expression.

CMOV

TODO...

ADD, ADC

The add## variants are the plain ADD instruction, for each bit width. The addcarry## variants are the ADC instruction that also adds the carry flag, useful for multi-word addition.

add##_reg_reg($dest, $src)
add##_reg_mem($reg, \@mem)
add##_mem_reg(\@mem, $reg)
add##_reg_imm($reg, $const)
add##_mem_imm(\@mem, $const)
addcarry##_reg(reg64, reg64)
addcarry##_mem(reg64, base_reg64, displacement, index_reg64, scale)
addcarry##_to_mem(reg64, base_reg64, displacement, index_reg64, scale)
addcarry##_const(reg64, const)
addcarry##_const_to_mem(const, base_reg64, displacement, index_reg64, scale)
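
To show why ADC is useful for multi-word addition, the following sketch models the carry chain in plain Perl using 32-bit words. The function name add_words is hypothetical, and the code only models the arithmetic; it does not generate any instructions.

```perl
use strict;
use warnings;

# Add two equal-length little-endian arrays of 32-bit words.
# The first iteration corresponds to ADD, the later ones to ADC.
# Illustration only; relies on perl having 64-bit integers.
sub add_words {
    my ($x, $y) = @_;
    my $carry = 0;
    my @sum;
    for my $i (0 .. $#$x) {
        my $t = $x->[$i] + $y->[$i] + $carry;
        push @sum, $t & 0xFFFFFFFF;    # low 32 bits are the result word
        $carry = $t >> 32;             # bit 32 plays the role of the carry flag
    }
    return (\@sum, $carry);
}

# 0x00000000_FFFFFFFF + 1 = 0x00000001_00000000
my ($sum, $carry) = add_words([ 0xFFFFFFFF, 0 ], [ 1, 0 ]);
print join(' ', @$sum), " carry=$carry\n";   # prints 0 1 carry=0
```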

AND

and##_reg_reg($dest, $src)
and##_reg_mem($reg, \@mem)
and##_mem_reg(\@mem, $reg)
and##_reg_imm($reg, $const)
and##_mem_imm(\@mem, $const)

OR

or##_reg(reg64, reg64)
or##_mem(reg64, base_reg64, displacement, index_reg64, scale)
or##_to_mem(reg64, base_reg64, displacement, index_reg64, scale)
or##_const(reg64, const)
or##_const_to_mem(const, base_reg64, displacement, index_reg64, scale)

XOR

xor##_reg(reg64, reg64)
xor##_mem(reg64, base_reg64, displacement, index_reg64, scale)
xor##_to_mem(reg64, base_reg64, displacement, index_reg64, scale)
xor##_const(reg64, const)
xor##_const_to_mem(const, base_reg64, displacement, index_reg64, scale)

SHL

Shift left by a constant or the CL register. The shift is at most 63 bits for 64-bit register, or 31 bits otherwise.

shl##_reg_imm( $reg, $const )
shl##_mem_imm( \@mem, $const )
shl##_reg_cl( $reg )
shl##_mem_cl( \@mem )

SHR

Shift right by a constant or the CL register. The shift is at most 63 bits for 64-bit register, or 31 bits otherwise.

shr##_reg_imm( $reg, $const )
shr##_mem_imm( \@mem, $const )
shr##_reg_cl( $reg, 'cl' // undef )
shr##_mem_cl( \@mem, 'cl' // undef )

SAR

Shift "arithmetic" right by a constant or the CL register, and sign-extend the left-most bits. The shift is at most 63 bits for 64-bit register, or 31 bits otherwise.

sar##_reg_imm( $reg, $const )
sar##_mem_imm( \@mem, $const )
sar##_reg_cl( $reg, 'cl' // undef )
sar##_mem_cl( \@mem, 'cl' // undef )
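
The difference between SHR and SAR is easiest to see on a negative value. The sketch below models both 64-bit shifts in plain Perl; model_shr64 and model_sar64 are hypothetical names used only for this illustration, and a perl with 64-bit integers is assumed.

```perl
use strict;
use warnings;

# Illustrative models of the 64-bit right shifts (not module code).
sub model_shr64 {
    my ($value, $count) = @_;
    return $value >> ($count & 63);       # zero-fills from the left
}

sub model_sar64 {
    my ($value, $count) = @_;
    $count &= 63;                         # shift counts are masked to 0..63
    my $result = $value >> $count;
    # replicate the sign bit into the vacated high bits
    $result |= ~0 << (64 - $count) if $count && ($value & (1 << 63));
    return $result;
}

printf "%016x\n", model_shr64(-8, 1);     # prints 7ffffffffffffffc
printf "%016x\n", model_sar64(-8, 1);     # prints fffffffffffffffc
```

Read as a signed number, the SAR result is -4, while SHR yields a large positive value.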

BSWAP

Swap the byte order of a register.

bswap64
bswap32
bswap16

(bswap16 is actually the XCHG instruction, since BSWAP itself only operates on 32 or 64 bits)
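
As a quick illustration of the effect on a value, the byte swap can be modeled with pack and unpack. The helper names below are hypothetical and exist only for this sketch; a perl with pack('Q') support is assumed.

```perl
use strict;
use warnings;

# Models of what a byte swap does to a value (not module code):
# write the value in one byte order, read it back in the other.
sub model_bswap32 { return unpack('N',  pack('V',  $_[0])) }
sub model_bswap64 { return unpack('Q>', pack('Q<', $_[0])) }

printf "%08x\n",  model_bswap32(0x11223344);   # prints 44332211
printf "%016x\n", model_bswap64(1);            # prints 0100000000000000
```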

CMP

Like SUB, but don't modify any arguments, just update RFLAGS.

cmp##_reg_reg($dest, $src)
cmp##_reg_mem($reg, \@mem)

Subtract mem (second arg) from reg (first arg)

cmp##_mem_reg(\@mem, $reg)

Subtract reg (second arg) from mem (first arg)

cmp##_reg_imm($reg, $const)

Subtract const from reg

cmp##_mem_imm(\@mem, $const)

Subtract const from contents of mem address

TEST

Like AND, but don't modify any arguments, just update flags. Note that order of arguments does not matter, and there is no "mem_reg" variant.

test##_reg_reg($dest, $src)
test##_reg_mem($reg, \@mem)
test##_reg_imm($reg, $const)
test##_mem_imm(\@mem, $const)

DEC

dec##_reg($reg)
dec##_mem(\@mem)

INC

inc##_reg($reg)
inc##_mem(\@mem)

NOT

Flip all bits in a target register or memory location.

notNN_reg($reg)
notNN_mem(\@mem)

NEG

Replace target register or memory location with signed negation (2's complement).

neg##_reg($reg)
neg##_mem(\@mem)

DIV, IDIV

divNNu_reg($reg)

Unsigned divide of _DX:_AX by a NN-bit register. (divides AX into AL,AH for 8-bit)

divNNu_mem(\@mem)

Unsigned divide of _DX:_AX by a NN-bit memory value referenced by 64-bit registers

divNNs_reg($reg)

Signed divide of _DX:_AX by a NN-bit register. (divides AX into AL,AH for 8-bit)

divNNs_mem(\@mem)

Signed divide of _DX:_AX by a NN-bit memory value referenced by 64-bit registers
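
The _DX:_AX convention can be modeled for the 16-bit unsigned case as follows. This is a sketch of the instruction's semantics only; model_div16u is a hypothetical name, and the die calls stand in for the divide-error exception the processor raises.

```perl
use strict;
use warnings;

# Model of a 16-bit unsigned DIV: the 32-bit dividend is DX:AX,
# the quotient goes to AX and the remainder to DX. Illustration only.
sub model_div16u {
    my ($dx, $ax, $divisor) = @_;
    die "divide error: division by zero\n" unless $divisor;
    my $dividend = ($dx << 16) | $ax;
    my $quotient = int($dividend / $divisor);
    die "divide error: quotient does not fit in 16 bits\n"
        if $quotient > 0xFFFF;
    return ($quotient, $dividend % $divisor);   # (new AX, new DX)
}

my ($ax, $dx) = model_div16u(1, 0, 3);          # 0x10000 / 3
print "AX=$ax DX=$dx\n";                        # prints AX=21845 DX=1
```

The overflow case is the reason the sign-extension instructions matter for the signed variants: the high half of the dividend has to be set up correctly before dividing.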

MUL

mul64s_dxax_reg
mul32s_dxax_reg
mul16s_dxax_reg
mul8s_ax_reg
mul64s_reg
mul32s_reg
mul16s_reg
mul64s_mem
mul32s_mem
mul16s_mem
mul64s_const_reg
mul32s_const_reg
mul16s_const_reg
mul64s_const_mem
mul32s_const_mem
mul16s_const_mem

sign extend

Various special-purpose sign extension instructions, mostly used to set up for DIV

sign_extend_al_ax, cbw
sign_extend_ax_eax, cwde
sign_extend_eax_rax, cdqe
sign_extend_ax_dx, cwd
sign_extend_eax_edx, cdq
sign_extend_rax_rdx, cqo

flag modifiers

Each flag modifier takes an argument of 0 (clear), 1 (set), or -1 (invert).

flag_carry($state), clc, cmc, stc
flag_direction($state), cld, std

PUSH

This only implements the 64-bit push instruction.

push_reg
push_imm
push_mem

POP

pop_reg
pop_mem

ENTER

->enter( $bytes_for_vars, $nesting_level )

bytes_for_vars is an unsigned 16-bit value, and nesting_level is a value 0..31 (byte masked to 5 bits)

Both constants may be expressions.

LEAVE

Un-do an ENTER instruction.

syscall

Syscall instruction, takes no arguments. (params are stored in pre-defined registers)

STRING INSTRUCTIONS

cmpNN_str

strcmp64, cmpsq
strcmp32, cmpsd
strcmp16, cmpsw
strcmp8, cmpsb

SYNCHRONIZATION INSTRUCTIONS

These special-purpose instructions relate to strict ordering of memory operations, cache flushing, or atomic operations useful for implementing semaphores.

compare_exchangeNN, cmpxchg

compare_exchange64
compare_exchange32
compare_exchange16
compare_exchange8

TODO

mfence, lfence, sfence

Parameterless instructions for memory access serialization. Forces memory operations before the fence to complete before memory operations after the fence. Lfence affects load operations, sfence affects store operations, and mfence affects both.

ENCODING x86_64 INSTRUCTIONS

The AMD64 Architecture Programmer's Manual is a somewhat tedious read, so here are my notes:

Typical 2-arg 64-bit instruction: REX ( AddrSize ) Opcode ModRM ( ScaleIndexBase ( Disp ) ) ( Immed )

	REX: use extended registers and/or 64-bit operand sizes.
		Not used for simple push/pop or handful of others
	REX = 0x40 + (W:1bit R:1bit X:1bit B:1bit)
		REX.W = "wide" (64-bit operand size when set)
		REX.R is 4th bit of ModRM.Reg
		REX.X is 4th bit of SIB.Index
		REX.B is 4th bit of ModRM.R/M or of SIB.Base or of the register field embedded in the opcode, depending on goofy rules
  
	ModRM: mode/registers flags
	ModRM = (Mod:2bit Reg:3bit R/M:3bit)
		ModRM.Mod indicates operands:
			11b means ( Reg, R/M-reg-value )
			00b means ( Reg, R/M-reg-addr ) unless second reg is SP/BP/R12/R13
			01b means ( Reg, R/M-reg-addr + 8-bit disp ) unless second reg is SP/R12
			10b means ( Reg, R/M-reg-addr + 32-bit disp ) unless second reg is SP/R12
			
			When accessing mem, R/M=100b means include the SIB byte for exotic addressing options
			In the 00b case, R/M=101b means use instruction pointer + 32-bit disp

	SIB: optional byte for wild and crazy memory addressing; activate with ModRM.R/M = 100b
	SIB = (Scale:2bit Index:3bit Base:3bit)
		address is (index_register << scale) + base_register (+disp per the ModRM.Mod bits)
		* unless index_register = 0100b, in which case no index register is used.
			(i.e. RSP cannot be used as an index register )
		* unless base_register = _101b and ModRM.Mod = 00b, in which case no base register is used
			and a 32-bit disp follows instead.
			(i.e. [R{BP,13} + R?? * 2] must be written as [R{BP,13} + R?? * 2 + 0] )
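
The bit layouts in these notes translate fairly directly into pack calls. The following is a minimal sketch, not the module's code: all sub names are hypothetical, registers are passed as numbers 0..15, and only two addressing shapes are covered (register to register, and base + index*scale + 8-bit displacement).

```perl
use strict;
use warnings;

# Byte builders following the notes above (illustrative helpers only).
sub rex_byte   { my ($w, $r, $x, $b) = @_; 0x40 | ($w << 3) | ($r << 2) | ($x << 1) | $b }
sub modrm_byte { my ($mod, $reg, $rm) = @_; ($mod << 6) | (($reg & 7) << 3) | ($rm & 7) }
sub sib_byte   { my ($scale, $index, $base) = @_; ($scale << 6) | (($index & 7) << 3) | ($base & 7) }

# 64-bit op with two register operands: REX.W opcode ModRM(mod=11b)
sub sketch_op64_reg_reg {
    my ($opcode, $reg, $rm) = @_;
    return pack('CCC', rex_byte(1, $reg >> 3, 0, $rm >> 3),
                       $opcode, modrm_byte(3, $reg, $rm));
}

# 64-bit op on [base + index << scale + disp8]:
# REX.W opcode ModRM(mod=01b, R/M=100b) SIB disp8
sub sketch_op64_reg_sib_disp8 {
    my ($opcode, $reg, $base, $index, $scale, $disp) = @_;
    die "RSP cannot be used as an index register\n" if $index == 4;
    return pack('CCCCc', rex_byte(1, $reg >> 3, $index >> 3, $base >> 3),
                         $opcode, modrm_byte(1, $reg, 4),
                         sib_byte($scale, $index, $base), $disp);
}

# mov rax, rcx        (opcode 89: MOV r/m64, r64; Reg is the source)
print unpack('H*', sketch_op64_reg_reg(0x89, 1, 0)), "\n";                # 4889c8
# mov rax, [rbx + rcx*8 + 16]   (opcode 8B: MOV r64, r/m64)
print unpack('H*', sketch_op64_reg_sib_disp8(0x8B, 0, 3, 1, 3, 16)), "\n"; # 488b44cb10
```

The high bit of each register number ends up in the REX prefix and the low three bits in ModRM or SIB, which is all that the "4th bit" remarks above amount to.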

UTILITY METHODS FOR ENCODING INSTRUCTIONS

_encode_op_reg_reg

Encode standard instruction with REX prefix which refers only to registers. This skips all the memory addressing logic since it is only operating on registers, and always produces known-length encodings.

_append_op##_reg_mem

Encode standard ##-bit instruction with REX prefix which addresses memory for one of its operands. The encoded length might not be resolved until later if an unknown displacement value was given.

_append_mathopNN_const

This is so bizarre I don't even know where to start. Most "math-like" instructions have an opcode for an immediate the size of the register (except 64-bit which only gets a 32-bit immediate), an opcode for an 8-bit immediate, and another opcode specifically for the AX register which is a byte shorter than the normal, which is the only redeeming reason to bother using it. Also, there is a constant stored in the 3 bits of the unused register in the ModRM byte which acts as an extension of the opcode.

These 4 methods are the generic implementation for encoding this mess. Each implementation also handles the possibility that the immediate value is an unknown variable resolved while the instructions are assembled.

_append_mathop64_const($opcodeAX32, $opcode8, $opcode32, $opcode_reg, $reg, $immed)

This one is annoying because it only gets a sign-extended 32-bit value, so you actually only get 31 bits of an immediate value for a 64-bit instruction.
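
The opcode selection described above can be sketched for the 64-bit case with a known immediate. This is an illustration with hypothetical names rather than the module's implementation; in particular it does not handle immediates that are still unknown at encoding time. The opcode-extension numbers are the standard ones: 0=ADD, 1=OR, 2=ADC, 3=SBB, 4=AND, 5=SUB, 6=XOR, 7=CMP.

```perl
use strict;
use warnings;

# Pick the shortest encoding of "<mathop> reg64, imm" (illustrative only).
# $ext is the 3-bit opcode extension stored in ModRM.Reg; $regnum is 0..15.
sub sketch_mathop64_imm {
    my ($ext, $regnum, $imm) = @_;
    my $rex   = 0x48 | ($regnum >> 3);                  # REX.W (+ REX.B)
    my $modrm = 0xC0 | ($ext << 3) | ($regnum & 7);     # mod=11b: register operand
    # 8-bit sign-extended immediate: opcode 83
    return pack('CCCc', $rex, 0x83, $modrm, $imm)
        if $imm >= -128 && $imm <= 127;
    die "immediate does not fit in a signed 32-bit value\n"
        unless $imm >= -0x8000_0000 && $imm <= 0x7FFF_FFFF;
    # RAX-only short form: no ModRM byte, one byte shorter
    return pack('CCl<', 0x48, 0x05 | ($ext << 3), $imm)
        if $regnum == 0;
    # general 32-bit sign-extended immediate: opcode 81
    return pack('CCCl<', $rex, 0x81, $modrm, $imm);
}

print unpack('H*', sketch_mathop64_imm(0, 0, 1)),    "\n";   # add rax, 1    -> 4883c001
print unpack('H*', sketch_mathop64_imm(0, 0, 1000)), "\n";   # add rax, 1000 -> 4805e8030000
print unpack('H*', sketch_mathop64_imm(0, 1, 1000)), "\n";   # add rcx, 1000 -> 4881c1e8030000
```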

_append_mathop32_const($opcodeAX32, $opcode8, $opcode32, $opcode_reg, $reg, $immed)
_append_mathop16_const($opcodeAX16, $opcode8, $opcode16, $opcode_reg, $reg, $immed)
_append_mathop8_const($opcodeAX8, $opcode8, $opcode_reg, $reg, $immed)

On the upside, this one only has one bit width, so the length of the instruction is known even if the immediate value isn't.

However, we also have to handle the case where "dil", "sil", etc need a REX prefix but AH, BH, etc can't have one.

_append_shiftop_reg_imm( $bitwidth, $opcode_1, $opcode_imm, $opreg, $reg, $immed )

Shift instructions often have a special case for shifting by 1. This utility method selects that opcode if the immediate value is 1.

It also allows the immediate to be an expression, though I doubt that will ever happen... Immediate values are always a single byte, and the processor masks them to 0..63 so the upper bits are irrelevant.
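
A sketch of the shift-by-1 special case for a 64-bit register operand and a known count is shown below. The sub name is hypothetical, and the opcode-extension values are the standard ones for this opcode group (4=SHL, 5=SHR, 7=SAR).

```perl
use strict;
use warnings;

# Choose between the "shift by 1" opcode (D1) and the imm8 opcode (C1).
# Illustration only; $regnum is 0..15.
sub sketch_shift64_imm {
    my ($ext, $regnum, $count) = @_;
    my $rex   = 0x48 | ($regnum >> 3);                  # REX.W (+ REX.B)
    my $modrm = 0xC0 | ($ext << 3) | ($regnum & 7);
    return pack('CCC', $rex, 0xD1, $modrm) if $count == 1;
    return pack('CCCC', $rex, 0xC1, $modrm, $count & 0xFF);
}

print unpack('H*', sketch_shift64_imm(4, 0, 1)), "\n";   # shl rax, 1 -> 48d1e0
print unpack('H*', sketch_shift64_imm(4, 0, 4)), "\n";   # shl rax, 4 -> 48c1e004
```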

_append_shiftop_mem_imm

Same as above, for memory locations

_encode_jmp_cond

Encodes a conditional jump instruction, which is either the short 2-byte form for 8-bit offsets, or 6 bytes for jumps of 32-bit offsets.
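
The two forms can be sketched as follows when both addresses are already known. The sub name and its arguments are hypothetical; $cc is the standard 4-bit condition code (for example 4 for JE/JZ and 5 for JNE/JNZ), and offsets are measured from the end of the jump instruction.

```perl
use strict;
use warnings;

# Encode a conditional jump located at address $from targeting $to.
# Illustration only: picks the 2-byte form when the offset fits in 8 bits.
sub sketch_jcc {
    my ($cc, $from, $to) = @_;
    my $rel8 = $to - ($from + 2);
    return pack('Cc', 0x70 | $cc, $rel8)
        if $rel8 >= -128 && $rel8 <= 127;
    # near form: 0F 80+cc rel32, six bytes in total
    return pack('CCl<', 0x0F, 0x80 | $cc, $to - ($from + 6));
}

print unpack('H*', sketch_jcc(5, 0, 5)),     "\n";   # jne +3     -> 7503
print unpack('H*', sketch_jcc(4, 0, 0x200)), "\n";   # je  +0x1fa -> 0f84fa010000
```

When the target is a label that has not been marked yet, the length cannot be chosen up front, which is why such jumps stay unresolved until the label positions are known.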

AUTHOR

Michael Conrad <mike@nrdvana.net>

COPYRIGHT AND LICENSE

This software is copyright (c) 2016 by Michael Conrad.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.