NAME
Disassemble::X86::FormatTree - Format machine instructions as a tree
SYNOPSIS
use Disassemble::X86;
$d = Disassemble::X86->new(format => "Tree");
DESCRIPTION
This module returns Intel x86 machine instructions as a tree structure, which is suitable for further processing.
The tree consists of hashrefs. There are three common keys, though only op is required:
- op
The operation being performed.
- size
The size of the result of the operation, in bits.
- arg
The arguments being operated on, in a listref. Each argument is normally represented by its own hashref; reg and lit nodes hold plain names and numbers instead.
 
Top-level nodes may also contain the following keys:
- start
The starting address of the instruction.
- len
The length of the instruction, in bytes.
- proc
The minimum processor model required, as described in Disassemble::X86.
- prefix
Set to 1 if this node is an opcode prefix such as rep or lock.
The op field commonly contains an opcode mnemonic. However, the following values may also appear:
- reg
A machine register.
- lit
A literal numeric value.
- mem
A reference to memory.
- seg
A segment prefix.
 
The argument list for a register contains the register name followed by its type. Register types include dword and word for general-purpose registers, seg for segment registers, and fp for floating-point registers. If the register is really part of a larger register, that register's name appears as a third arg.
That's quite a bit to digest all at once. Here is a simple example:
mov eax,0x1
    becomes
{op=>"mov", arg=>[
    {op=>"reg", size=>32, arg=>["eax", "dword"]},
    {op=>"lit", size=>32, arg=>[0x1]}
], start=>1234, len=>5, proc=>386}
That's fairly straightforward. Here's something a bit more involved.
add byte[di+0x4],al
    becomes
{op=>"add", arg=>[
    {op=>"mem", size=>8, arg=>[
        {op=>"+", size=>16, arg=> [
            {op=>"reg", size=>16, arg=>["di", "word", "edi"]},
            {op=>"lit", size=>16, arg=>[0x4]}
        ]}
    ]},
    {op=>"reg", size=>8, arg=>["al", "lobyte", "eax"]}
], start=>5678, len=>3, proc=>86}
Notice that the details of the address calculation are encapsulated within the + node. The address is 16 bits long, but the value fetched from memory is only 8 bits. This distinction is captured cleanly.
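The split between address size and operand size can be checked directly on the tree. The following is a minimal sketch, not part of the module: expr_text is a hypothetical helper, and it assumes that a mem node holds a single address expression and that any op other than reg, lit, or mem is an infix operator such as +.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by the module): turn an
# expression node back into text.
sub expr_text {
    my ($node) = @_;
    my $op  = $node->{op};
    my @arg = @{ $node->{arg} || [] };
    return $arg[0] if $op eq "reg";                    # register name
    return sprintf("0x%x", $arg[0]) if $op eq "lit";   # literal value
    return "[" . expr_text($arg[0]) . "]" if $op eq "mem";
    # Assumption: any other op is an infix operator joining its arguments.
    return join($op, map { expr_text($_) } @arg);
}

# The memory operand from the example above.
my $mem = {op=>"mem", size=>8, arg=>[
    {op=>"+", size=>16, arg=>[
        {op=>"reg", size=>16, arg=>["di", "word", "edi"]},
        {op=>"lit", size=>16, arg=>[0x4]},
    ]},
]};

my $addr = $mem->{arg}[0];
print "address: ", expr_text($addr), " ($addr->{size} bits)\n";
print "fetched: $mem->{size} bits\n";
```

The size on the + node describes the address calculation, while the size on the enclosing mem node describes the value read from memory.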
Yes, this is fairly complicated to work with. If you don't need all this complexity, try the FormatText module instead.
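For callers that do need the tree, most processing reduces to a recursive walk over the arg lists. The sketch below is one way to do that, under the assumption that every hashref found in an arg list is itself a node; walk_tree is a hypothetical name, not a function supplied by this module. It collects every register in the second example, together with the enclosing register when one is given.

```perl
use strict;
use warnings;

# Hypothetical helper: call $visit on a node and on every nested node.
sub walk_tree {
    my ($node, $visit) = @_;
    $visit->($node);
    for my $arg (@{ $node->{arg} || [] }) {
        # reg and lit nodes hold plain names and numbers; skip those.
        walk_tree($arg, $visit) if ref($arg) eq "HASH";
    }
}

# Tree for: add byte[di+0x4],al
my $tree = {op=>"add", arg=>[
    {op=>"mem", size=>8, arg=>[
        {op=>"+", size=>16, arg=>[
            {op=>"reg", size=>16, arg=>["di", "word", "edi"]},
            {op=>"lit", size=>16, arg=>[0x4]},
        ]},
    ]},
    {op=>"reg", size=>8, arg=>["al", "lobyte", "eax"]},
], start=>5678, len=>3, proc=>86};

my @regs;
walk_tree($tree, sub {
    my ($node) = @_;
    return unless $node->{op} eq "reg";
    # Register args: name, type, and optionally the larger register.
    my ($name, $type, $parent) = @{ $node->{arg} };
    push @regs, defined $parent ? "$name ($type, part of $parent)"
                                : "$name ($type)";
});
print "$_\n" for @regs;
```

Passing the visitor as a code reference keeps the traversal separate from whatever is done at each node, so the same walk can serve for counting operations, rewriting literals, and similar tasks.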
METHODS
format_instr
$tree = Disassemble::X86::FormatTree->format_instr($tree);
The format_instr method is a no-op. It returns exactly the input it is given.
SEE ALSO
Disassemble::X86, Disassemble::X86::FormatText
AUTHOR
Bob Mathews <bobmathews@alumni.calpoly.edu>
COPYRIGHT
Copyright (c) 2002 Bob Mathews. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.