NAME
Disassemble::X86::FormatText - Format machine instructions as text
SYNOPSIS
use Disassemble::X86;
$d = Disassemble::X86->new(format => "Text");
DESCRIPTION
This module formats disassembled Intel x86 machine instructions as human-readable text. Output is in Intel assembler syntax, with a few minor exceptions, as described as below. Output is produced in lower case.
Certain conventions are used in order to make it easier for programs to process the output of the disassembler. This is useful when you don't want the complexity of working with the output of the FormatTree module. I find that these changes make the output more readable to humans as well.
Segment register override prefixes and address/operand size prefixes are incorporated into the argument list. In some cases, this is accomplished by using an "explicit operand" form of the instruction instead of the usual implicit form.
cs: xlatb becomes xlat byte[cs:ebx]
If other prefixes are present, they precede the opcode mnemonic separated by single space characters. If the instruction has any operands, they appear after another space, separated by commas. There is no whitespace between or within operands, so you can separate the parts of an instruction with split ' '
. In order to make this possible, the word "PTR" is omitted from memory operands.
mov 0x42, WORD PTR [edx] becomes mov 0x42,word[edx]
If one or more prefixes are present, but there are no operands, a single "." is added as an operand. This means you can always assume that the last component is an operand, if more than one component is present. The only case where this would normally occur is with string operations. However, this module always uses the explicit operand form for string ops.
rep movsb becomes rep movs byte[es:di],byte[si]
not rep movsb .
The memory operand size (byte, word, etc.) is usually included in the operand, even if it can be determined from context. That way, the size is not lost if later processing separates the operand from the rest of the instruction. (Some memory operands have no real size, though, while others have unusual sizes which are not shown.)
ADD eax,[0x1234] becomes add eax,dword[0x1234]
Unlike AT&T assembler syntax, individual operands never contain embedded commas. This means that you can safely break up the operand list with split/,/
.
lea 0x0(,%ebx,4),%edi becomes lea edi,[ebx*4+0x0]
METHODS
format_instr
$text = Disassemble::X86::Text->format_instr($tree);
Accepts a machine instruction in tree format, and converts it to text.
SEE ALSO
AUTHOR
Bob Mathews <bobmathews@alumni.calpoly.edu>
COPYRIGHT
Copyright (c) 2002 Bob Mathews. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.