NAME
Disassemble::X86 - Disassemble Intel x86 binary code
SYNOPSIS
use Disassemble::X86;
$d = Disassemble::X86->new(text => $text_seg);
while (defined( $op = $d->disasm() )) {
    printf "%04x %s\n", $d->op_start(), $op;
}
DESCRIPTION
This module disassembles binary-coded Intel x86 machine instructions. Output can be produced as plain text, or as a tree structure suitable for further processing.
METHODS
new
$d = Disassemble::X86->new(
    text      => $text_seg,
    start     => $text_load_addr,
    pos       => $initial_eip,
    addr_size => 32,
    data_size => 32,
    size      => 32,
    format    => "Text",
);
Creates a new disassembler object. All of the named parameters listed below are optional.
- text
  The so-called text segment, which consists of the binary data to be disassembled. It can be given either as a string or as a Disassemble::X86::MemRegion object.
- start
  The address at which the text segment would be loaded to execute the program. This parameter is ignored if text is a MemRegion object, and defaults to 0 otherwise.
- pos
  The address at which disassembly is to begin, unless changed by $d->pos(). Default value is the start of the text segment.
- addr_size
  Gives the address size (16 or 32 bit) which will be used when disassembling the code. Default is 32 bits. See below.
- data_size
  Gives the data operand size, similar to addr_size.
- size
  Sets both addr_size and data_size.
- format
  Gives the name of an output-formatting module, which will be used to process the disassembled instructions. Currently, valid values are Text and Tree. See Disassemble::X86::FormatText, Disassemble::X86::FormatTree.
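As an illustration of how these parameters combine, the following sketch builds a disassembler for 16-bit real-mode code. It assumes the input is a PC boot sector, which the BIOS loads at address 0x7C00. The helper name boot_sector_disassembler is hypothetical and not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Disassemble::X86): build a disassembler
# for a boot sector image held in the byte string $image.
sub boot_sector_disassembler {
    my ($image) = @_;
    require Disassemble::X86;    # loaded on first call
    return Disassemble::X86->new(
        text   => $image,        # raw bytes to disassemble
        start  => 0x7c00,        # address at which the BIOS loads a boot sector
        size   => 16,            # sets both addr_size and data_size to 16
        format => "Text",        # plain-text output
    );
}
```

Because pos is omitted, disassembly begins at the start of the text segment, which is 0x7c00 here.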
disasm
$op = $d->disasm();
Disassembles a single machine instruction from the current position. Advances the current position to the next instruction. If no valid instruction is found at the current position, returns undef and leaves the current position unchanged. In that case, you can check $d->error() for more information.
addr_size
$d->addr_size(16);
Sets the address size for disassembled code. Valid values are 16, "word", 32, "dword", and "long", where "word" is a synonym for 16, and "dword" and "long" are synonyms for 32. With no argument, returns the current address size as 16 or 32.
data_size
$d->data_size("long");
Similar to addr_size above, but sets the data operand size.
pos
$d->pos($new_pos);
Sets the current disassembly position. With no argument, returns the current position.
text
$text = $d->text();
Returns the text segment as a Disassemble::X86::MemRegion object.
at_end
until ( $d->at_end() ) {
    ...
}
Returns true if the current disassembly position has reached the end of the text segment.
contains
if ( $d->contains($addr) ) {
    ...
}
Returns true if $addr is within the memory region being disassembled.
next_byte
$byte = $d->next_byte();
Returns the next byte from the current disassembly position as an integer value, and advances the current position by one. This can be used to skip over invalid instructions that are encountered during disassembly. If the current position is not valid, returns 0, but still advances the current position. Attempting to read beyond the 15-byte opcode size limit will cause an error.
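The skipping technique described above can be sketched as a loop that falls back to next_byte whenever disasm fails. This is only an illustration built from the documented methods. The helper name disasm_with_recovery and the "db" output format are assumptions, not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper: disassemble everything reachable from the current
# position of $d, skipping one byte whenever disasm() fails.
# Returns a list of [address, text] pairs.
sub disasm_with_recovery {
    my ($d) = @_;
    my @lines;
    until ($d->at_end()) {
        my $op = $d->disasm();
        if (defined $op) {
            push @lines, [ $d->op_start(), $op ];
            next;
        }
        # disasm() left the position unchanged, so note where we are and
        # why it failed, then consume a single byte and try again.
        my $addr = $d->pos();
        my $err  = $d->error();
        $err = "unknown error" unless defined $err;
        my $byte = $d->next_byte();
        push @lines, [ $addr, sprintf("db 0x%02x  ; %s", $byte, $err) ];
    }
    return @lines;
}
```

Each skipped byte is reported on its own line, so the listing still accounts for every byte in the text segment.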
op
This and the following functions return information about the previously disassembled machine instruction. $d->op() returns the instruction itself, in tree-structure format.
op_start
Returns the starting address of the instruction.
op_len
Returns the length of the instruction, in bytes.
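As a sketch of how op_start and op_len can be combined, the helper below formats one listing line that includes the raw instruction bytes. It assumes the caller still has the original byte string and load address that were passed to new. The name listing_line and the column layout are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: format the instruction most recently returned by
# $d->disasm() as "address  hex bytes  text". $code is the original byte
# string and $start its load address (the values given to new()).
sub listing_line {
    my ($d, $code, $start, $op) = @_;
    # Slice the instruction's bytes out of the original string.
    my $raw = substr($code, $d->op_start() - $start, $d->op_len());
    my $hex = join " ", map { sprintf "%02x", ord($_) } split //, $raw;
    return sprintf "%08x  %-24s %s", $d->op_start(), $hex, $op;
}
```

It would be called once per successful disasm, for example as listing_line($d, $text_seg, $text_load_addr, $op).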
op_proc
Returns the minimum processor model required. For instructions present in the original 8086 processor, the value 86 is returned. For instructions supported by the 8087 math coprocessor, the value is 87. Instructions initially introduced with the Pentium return 586, and so on. Note that setting the address or operand size to 32 bits requires at least a 386. Other possible return values are "mmx", "sse", "sse2", "3dnow", and "3dnow-e" (for extended 3DNow! instructions).
This information should be used carefully, because there may be subtle differences between different steppings of the same processor. In some cases, you must check the CPUID instruction to see exactly what your processor supports. When in doubt, consult the Intel Architecture Software Developer's Manual.
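Because op_proc can return either a number (such as 86 or 586) or a string (such as "mmx"), it is safer to treat the values as hash keys than to compare them numerically. The following sketch tallies the values seen over a run of code. The helper name proc_requirements is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: disassemble from the current position of $d and
# count how often each op_proc() value occurs. Stops at the end of the
# text segment or at the first instruction that cannot be disassembled.
sub proc_requirements {
    my ($d) = @_;
    my %seen;
    until ($d->at_end()) {
        last unless defined $d->disasm();
        $seen{ $d->op_proc() }++;
    }
    return \%seen;
}
```

The result is a hash reference, so the tally can be printed with something like: print "$_: $req->{$_}\n" for sort keys %$req;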
error
Returns the error message from the most recent failed attempt to disassemble an instruction.
LIMITATIONS
Multiple discontinuous text segments are not supported. Use additional Disassemble::X86 objects if you need them.
In some cases, this module will disassemble an opcode that would actually cause the processor to raise an illegal opcode exception. This may also be construed as a feature.
Some of the more exotic instructions like cache control and MMX extensions have not been thoroughly tested. Please let me know if you find something that is broken.
SEE ALSO
Disassemble::X86::MemRegion, Disassemble::X86::FormatText, Disassemble::X86::FormatTree
AUTHOR
Bob Mathews <bobmathews@alumni.calpoly.edu>
COPYRIGHT
Copyright (c) 2002 Bob Mathews. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.