NAME
Disassemble::X86::FormatTree - Format machine instructions as a tree
SYNOPSIS
use Disassemble::X86;
$d = Disassemble::X86->new(format => "Tree");
DESCRIPTION
This module returns Intel x86 machine instructions as a tree structure, which is suitable for further processing.
The tree consists of hashrefs. There are three common keys, though only op is required:
- op
  The operation being performed.
- size
  The size of the result of the operation, in bits.
- arg
  The arguments being operated on, in a listref. Each argument is represented by its own hashref.
Top-level nodes may also contain the following keys:
- start
  The starting address of the instruction.
- len
  The length of the instruction, in bytes.
- proc
  The minimum processor model required, as described in Disassemble::X86.
- prefix
  Set to 1 if this node is an opcode prefix such as rep or lock.
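As a minimal sketch of how these top-level keys might be used, the following computes the address of the next instruction from start and len, and checks the prefix flag. The node is hand-built with illustrative values, and next_addr and is_prefix are hypothetical helpers, not part of this module.

```perl
use strict;
use warnings;

# Hand-built top-level node with illustrative values (not real disassembler output).
my $node = { op => "mov", arg => [], start => 1234, len => 5, proc => 386 };

# Hypothetical helper: address of the instruction that follows this one.
sub next_addr {
    my ($n) = @_;
    return $n->{start} + $n->{len};
}

# Hypothetical helper: true if the node is flagged as an opcode prefix.
sub is_prefix {
    my ($n) = @_;
    return $n->{prefix} ? 1 : 0;
}

printf "next instruction at %d, prefix: %s\n",
    next_addr($node), is_prefix($node) ? "yes" : "no";
```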
The op field commonly contains an opcode mnemonic. However, the following values may also appear:
- reg
  A machine register.
- lit
  A literal numeric value.
- mem
  A reference to memory.
- seg
  A segment prefix.
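Code that consumes the tree will typically dispatch on the op field. The following is a rough sketch of that idea, not the module's own code: to_text is a hypothetical helper, the tree is hand-built, and the sketch assumes that a mem node holds its address expression as its first argument. Any op without a special case, including seg, falls through to a generic form.

```perl
use strict;
use warnings;

# Hypothetical renderer: dispatch on the op field of each node.
sub to_text {
    my ($node) = @_;
    return $node unless ref($node) eq "HASH";    # bare scalar argument
    my $op   = $node->{op};
    my @args = @{ $node->{arg} || [] };
    return $args[0]                      if $op eq "reg";  # register name
    return sprintf("0x%x", $args[0])     if $op eq "lit";  # numeric literal
    return "[" . to_text($args[0]) . "]" if $op eq "mem";  # memory reference
    # Anything else (mnemonics, operators, seg) gets a generic form here.
    return "$op(" . join(", ", map { to_text($_) } @args) . ")";
}

# Hand-built tree for illustration.
my $tree = { op => "mov", arg => [
    { op => "reg", size => 32, arg => ["eax", "dword"] },
    { op => "lit", size => 32, arg => [0x1] },
] };

print to_text($tree), "\n";
```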
The argument list for a register contains the register name followed by its type. Register types include dword and word for general-purpose registers, seg for segment registers, and fp for floating-point registers. If the register is really part of a larger register, that register's name appears as a third argument.
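To make the layout of a register's argument list concrete, here is a small sketch that unpacks it into named fields. reg_info is a hypothetical helper, and the sample node is hand-built.

```perl
use strict;
use warnings;

# Hypothetical helper: unpack the arg list of a "reg" node into named fields.
# Returns undef for any node that is not a register.
sub reg_info {
    my ($node) = @_;
    return undef unless $node->{op} eq "reg";
    my ($name, $type, $parent) = @{ $node->{arg} };
    return { name => $name, type => $type, parent => $parent };
}

# Hand-built node: al is the low byte of eax.
my $al   = { op => "reg", size => 8, arg => ["al", "lobyte", "eax"] };
my $info = reg_info($al);

print "$info->{name} ($info->{type})";
print " is part of $info->{parent}" if defined $info->{parent};
print "\n";
```

The parent field is undefined when the register is not part of a larger one, so callers should test it with defined before use.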
That's quite a bit to digest all at once. Here is a simple example:
mov eax,0x1
becomes
{op=>"mov", arg=>[
    {op=>"reg", size=>32, arg=>["eax", "dword"]},
    {op=>"lit", size=>32, arg=>[0x1]}
], start=>1234, len=>5, proc=>386}
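Because every node has the same shape, a tree like this one can be walked with a short recursive routine. As an illustrative sketch, the following collects the names of all registers that appear in an instruction. registers is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Tree copied from the example above.
my $tree = { op => "mov", arg => [
    { op => "reg", size => 32, arg => ["eax", "dword"] },
    { op => "lit", size => 32, arg => [0x1] },
], start => 1234, len => 5, proc => 386 };

# Hypothetical helper: collect the names of all registers in a tree.
sub registers {
    my ($node) = @_;
    return () unless ref($node) eq "HASH";          # plain scalars carry no registers
    return ($node->{arg}[0]) if $node->{op} eq "reg";
    return map { registers($_) } @{ $node->{arg} || [] };
}

print join(",", registers($tree)), "\n";
```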
That's fairly straightforward. Here's something a bit more involved.
add byte[di+0x4],al
becomes
{op=>"add", arg=>[
    {op=>"mem", size=>8, arg=>[
        {op=>"+", size=>16, arg=>[
            {op=>"reg", size=>16, arg=>["di", "word", "edi"]},
            {op=>"lit", size=>16, arg=>[0x4]}
        ]}
    ]},
    {op=>"reg", size=>8, arg=>["al", "lobyte", "eax"]}
], start=>5678, len=>3, proc=>86}
Notice that the details of the address calculation are encapsulated within the + node. The address is 16 bits long, but the value fetched from memory is only 8 bits. This distinction is captured cleanly.
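The two sizes can be read straight off the tree. The sketch below reports, for each memory operand, the size of the value fetched and the size of the address. mem_sizes is a hypothetical helper, and it assumes that the first argument of a mem node is the address expression, as in the example above.

```perl
use strict;
use warnings;

# Tree copied from the example above.
my $tree = { op => "add", arg => [
    { op => "mem", size => 8, arg => [
        { op => "+", size => 16, arg => [
            { op => "reg", size => 16, arg => ["di", "word", "edi"] },
            { op => "lit", size => 16, arg => [0x4] },
        ] },
    ] },
    { op => "reg", size => 8, arg => ["al", "lobyte", "eax"] },
], start => 5678, len => 3, proc => 86 };

# Hypothetical helper: for each memory operand of an instruction,
# return a pair [value bits, address bits].
# Assumes the first arg of a mem node is its address expression.
sub mem_sizes {
    my ($instr) = @_;
    return map  { [ $_->{size}, $_->{arg}[0]{size} ] }
           grep { $_->{op} eq "mem" } @{ $instr->{arg} };
}

for my $pair (mem_sizes($tree)) {
    printf "%d-bit value at a %d-bit address\n", @$pair;
}
```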
Yes, this is fairly complicated to work with. If you don't need all this complexity, try the FormatText module instead.
METHODS
format_instr
$tree = Disassemble::X86::FormatTree->format_instr($tree);
The format_instr method is a no-op. It returns exactly the same input it is given.
SEE ALSO
Disassemble::X86, Disassemble::X86::FormatText
AUTHOR
Bob Mathews <bobmathews@alumni.calpoly.edu>
COPYRIGHT
Copyright (c) 2002 Bob Mathews. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.