NAME

SMB::Parser - Convenient data parser for network protocols like SMB

SYNOPSIS

use SMB::Parser;

# Parse an imaginary packet of the following structure:
#   protocol signature (2 bytes in big-endian), header (48),
#   secret key (8), flags (1), mode (2 in little-endian),
#   payload offset (4) and length (4),
#   filename prefixed with length (2 + length),
#   padding to 4 bytes,
#   payload
# SMB::Packer documentation shows how it could be packed.

my $parser = SMB::Parser->new($packet_data_buffer);

die if $parser->uint16_be != 0xFACE;  # check signature
$parser->skip(48);                 # skip header (48 bytes)
my $body_start = $parser->offset;  # store offset (50 here)

my $secret = $parser->bytes(8);
my $flags  = $parser->uint8;
my $mode   = $parser->uint16;

my $payload_offset = $parser->uint32;
my $payload_length = $parser->uint32;

my $text_length = $parser->uint16;
my $filename = $parser->utf16($text_length);

$parser->align;  # redundant; the reset below jumps straight to the payload
$parser->reset($body_start + $payload_offset);
my $payload = $parser->bytes($payload_length);

$parser->align;
my $unconsumed_buffer = $parser->bytes($parser->size - $parser->offset);
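
To try the code above without a live connection, a test buffer with the same layout can be built using only core Perl. This is a minimal sketch under stated assumptions: the file name, secret, flag and payload values are invented for illustration, padding is counted from the start of the packet, and the payload offset is taken relative to the body start, as the reset call above implies. SMB::Packer remains the intended way to pack such data.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Illustrative values only; none of them come from a real protocol.
my $filename = encode('UTF-16LE', 'a.txt');  # 10 bytes
my $payload  = 'PAYLOAD';

# Signature (big-endian) followed by a zero-filled 48-byte header.
my $head = pack('n a48', 0xFACE, '');

# Body fields that precede the file name:
# secret (8), flags (1), mode (2), payload offset (4),
# payload length (4), file name length (2).
my $fixed_len = 8 + 1 + 2 + 4 + 4 + 2;

# Pad up to the next 4-byte boundary, counted from the packet start.
my $unpadded = length($head) + $fixed_len + length($filename);
my $padding  = (4 - $unpadded % 4) % 4;

# The payload offset is relative to the body start (offset 50).
my $payload_offset = $unpadded + $padding - length($head);

my $packet_data_buffer = $head
	. pack('a8 C v V V v', 'SECRET!!', 0x01, 0x0002,
		$payload_offset, length($payload), length($filename))
	. $filename
	. ("\0" x $padding)
	. $payload;
```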

DESCRIPTION

This class parses binary data, such as the data of a network packet.

It supports extracting blobs, unsigned integers of different lengths, text in an arbitrary encoding (SMB uses UTF-16LE) and more.

The current data pointer is usually between 0 and the data size. Once set, the managed data is never changed, so the data pointer may move past the end of the data if the caller is not careful. This differs from SMB::Packer, where the data is automatically extended in this case.

This class inherits from SMB, so msg, err, mem, dump, auto-created field accessors and other methods are available as well.

METHODS

new DATA

Class constructor. Returns an SMB::Parser instance and initializes its data with DATA and its pointer with 0 using set.

reset [OFFSET=0]

Resets the current data pointer.

Specifying an OFFSET beyond the managed data size does not produce an error, but is likely to cause all subsequent parsing calls to return empty/null values, possibly with warnings, while the pointer still continues to advance consistently.

set DATA [OFFSET=0]

Sets the object DATA (binary scalar) to be parsed and resets the pointer using reset.

cut [OFFSET=<current-offset>]

Cuts the data up to the given OFFSET (by default, up to the current offset). This is useful for stripping all processed data and bringing the offset back to 0.

If OFFSET is less than the current offset, then the current offset is adjusted correspondingly (reduced by OFFSET). If it is greater, then the data is still cut as requested and the current offset is reset to 0.
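
The offset adjustment can be modelled in a few lines of plain Perl. This is only a sketch of the rule described above, not the module's code: cut_model is a hypothetical helper, and clamping an oversized cut offset to the data size is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SMB::Parser) modelling the rule above.
# Takes the data, the current offset and the cut offset; returns the
# remaining data and the adjusted current offset.
sub cut_model {
	my ($data, $current, $cut) = @_;
	# Assumption: a cut offset past the end removes all the data.
	$cut = length($data) if $cut > length($data);
	my $rest = substr($data, $cut);
	my $new_offset = $cut > $current ? 0 : $current - $cut;
	return ($rest, $new_offset);
}

# Cut before the current offset: the offset is reduced by the cut size.
my ($rest1, $offset1) = cut_model("0123456789", 6, 4);  # "456789", 2

# Cut past the current offset: the offset is reset to 0.
my ($rest2, $offset2) = cut_model("0123456789", 6, 8);  # "89", 0
```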

data

Returns the managed data (binary scalar).

size

Returns the managed data size.

offset

Returns the current data pointer (starts from 0).

align [START_OFFSET=0] [STEP=4]

Advances the pointer, if needed, to the next alignment point (alignment points occur every STEP bytes, counting from START_OFFSET).
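
The arithmetic behind alignment can be sketched in plain Perl. The helper aligned_offset below is hypothetical and only illustrates the rule as described here; it is not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SMB::Parser): returns the first offset
# at or after $offset that lies a multiple of $step away from $start.
sub aligned_offset {
	my ($offset, $start, $step) = @_;
	$start //= 0;
	$step  //= 4;
	my $remainder = ($offset - $start) % $step;
	return $remainder ? $offset + $step - $remainder : $offset;
}

my $from_65       = aligned_offset(65);         # 68
my $from_68       = aligned_offset(68);         # 68, already aligned
my $from_65_at_50 = aligned_offset(65, 50);     # 66 (50, 54, 58, 62, 66)
my $from_13_by_8  = aligned_offset(13, 0, 8);   # 16
```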

skip N_BYTES

Advances the pointer by N_BYTES (non-negative integer).

Returns the object, to allow chaining a subsequent parsing method.

bytes N_BYTES

Normally returns the binary scalar of length N_BYTES starting from the current data pointer and advances the pointer.

On data overflow, fewer bytes than N_BYTES are returned (and on subsequent calls, 0 bytes are returned). The data pointer is guaranteed to advance by N_BYTES, even on or after the overflow.

The following parsing methods use this method internally, so they share the same logic about reaching the end-of-data and advancing the pointer.
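
The overflow rule can be pictured with plain substr on a scalar. The sketch below is not the module's code: read_bytes is a hypothetical helper that assumes the behaviour described above, returning whatever is available and always advancing the offset by the requested count.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SMB::Parser): returns up to $n bytes
# of $data starting at $$offset_ref and always advances the offset by $n.
sub read_bytes {
	my ($data, $offset_ref, $n) = @_;
	my $chunk = $$offset_ref >= length($data)
		? ''
		: substr($data, $$offset_ref, $n);
	$$offset_ref += $n;
	return $chunk;
}

my $data = "ABCDEF";
my $offset = 0;

my $first  = read_bytes($data, \$offset, 4);  # "ABCD", offset becomes 4
my $second = read_bytes($data, \$offset, 4);  # "EF" only, offset becomes 8
my $third  = read_bytes($data, \$offset, 4);  # "", offset becomes 12
```

A caller that needs to detect truncation can compare the length of the returned scalar with the requested count.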

str N_BYTES [ENCODING='UTF-16LE']

Decodes N_BYTES (non-negative integer) bytes as text in the requested encoding, starting from the current data pointer.

The returned string has the utf8 flag set if it is non-ASCII.
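
Note that N_BYTES counts bytes, not characters. The following sketch uses the core Encode module directly (an assumption about what such decoding amounts to, not a copy of the module's code) to show the difference for UTF-16LE text.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Build UTF-16LE bytes as they might appear in a packet.
# "\x{e9}" is the character e with an acute accent.
my $raw = encode('UTF-16LE', "h\x{e9}llo");

# Decode the bytes back into a Perl character string.
my $text = decode('UTF-16LE', $raw);

my $n_bytes = length($raw);   # 10: two bytes per character here
my $n_chars = length($text);  # 5
```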

utf16 N_BYTES

The same as str with encoding 'UTF-16LE'.

utf16_be N_BYTES

The same as str with encoding 'UTF-16BE'.

uint8
uint16
uint32
uint16_be
uint32_be
uint64

Unpacks an unsigned integer of the length in bits given in the method name (i.e. 1, 2, 4 or 8 bytes).

By default, the byte order is little-endian (since it is used in SMB). The method suffix "_be" denotes the big-endian byte order for parsing.
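
The effect of the byte order can be seen with Perl's built-in unpack. This sketch uses core unpack templates for illustration only; it does not claim to show how the methods above are implemented.

```perl
use strict;
use warnings;

# Four raw bytes: 0x34 0x12 0x78 0x56
my $bytes = "\x34\x12\x78\x56";

my $u8   = unpack('C', $bytes);  # single byte:          0x34
my $le16 = unpack('v', $bytes);  # little-endian 16-bit: 0x1234
my $be16 = unpack('n', $bytes);  # big-endian 16-bit:    0x3412
my $le32 = unpack('V', $bytes);  # little-endian 32-bit: 0x56781234
my $be32 = unpack('N', $bytes);  # big-endian 32-bit:    0x34127856
```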

fid1

Parses a file id used in SMB 1.

Returns an unsigned integer of 2 bytes.

fid2

Parses a file id used in SMB 2.

Returns an array ref of two unsigned integers of 8 bytes each.

SEE ALSO

SMB::Packer, SMB.

AUTHOR

Mikhael Goikhman <migo@cpan.org>