NAME
Statistics::R::IO::REXPFactory - Functions for parsing R data files
VERSION
version 0.07
SYNOPSIS
    use feature 'say';
    use Statistics::R::IO::REXPFactory qw( unserialize );

    # Assume $data was created by reading, say, an RDS file
    my ($rexp, $state) = @{unserialize($data)}
        or die "couldn't parse";

    # If we're reading an RDS file, there should be no data left
    # unparsed
    die 'Unread data remaining in the RDS file' unless $state->eof;

    # the result of the unserialization is a REXP
    say $rexp;

    # REXPs can be converted to the closest native Perl data type
    print $rexp->to_pl;
DESCRIPTION
This module implements the actual reading of serialized R objects and their conversion to a Statistics::R::REXP. You are not expected to use it directly, as it's normally wrapped by "readRDS" in Statistics::R::IO and "readRData" in Statistics::R::IO.
SUBROUTINES
- unserialize $data
-
Constructs a Statistics::R::REXP object from its serialization in $data. Returns an array reference holding a pair: the object and the Statistics::R::IO::ParserState at the end of serialization.
- intsxp, langsxp, lglsxp, listsxp, rawsxp, realsxp, refsxp, strsxp, symsxp, vecsxp, envsxp, charsxp
-
Parsers for the corresponding R SEXP-types.
- object_content
-
Parses object info and its data by sequencing "unpack_object_info" and "object_data".
- unpack_object_info
-
Parser for serialized object info structure. Returns a hash with keys "is_object", "has_attributes", "has_tag", "object_type", and "levels", each corresponding to the field in R serialization described in http://cran.r-project.org/doc/manuals/r-release/R-ints.html#Serialization-Formats. An additional key "flags" contains the full 32-bit value as stored in the file.
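As an illustration of the fields listed above, the following sketch splits a flags value into its parts. It assumes the bit layout used by R's own serialization code (type in the low 8 bits, one bit each for object, attributes, and tag, levels in the bits from 12 upward). The name decode_object_flags is hypothetical and is not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module's API: splits the 32-bit
# flags word of a serialized R object into the documented fields.
# Assumes the bit layout used by R's serialization code.
sub decode_object_flags {
    my ($flags) = @_;
    return {
        object_type    => $flags & 0xFF,        # low 8 bits: SEXP type
        is_object      => ($flags >> 8) & 1,    # bit 8
        has_attributes => ($flags >> 9) & 1,    # bit 9
        has_tag        => ($flags >> 10) & 1,   # bit 10
        levels         => $flags >> 12,         # remaining high bits
        flags          => $flags,               # full value, unchanged
    };
}

# 0x20D: SEXP type 13 (integer vector) with the attributes bit set
my $info = decode_object_flags(0x20D);
printf "type=%d attrs=%d tag=%d\n",
    @{$info}{qw(object_type has_attributes has_tag)};
```

The actual parser may represent these fields differently (for example, as Perl booleans), so treat the returned hash as a sketch of the idea only.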
- object_data $obj_info
-
Parser for a serialized R object. It selects the parser for the particular type based on the "object_type" key of the $obj_info hash.
- vector_and_attributes $object_info, $element_parser, $rexp_class
-
Convenience parser for vectors, which are serialized first with a SEXP for the vector elements, followed by attributes stored as a tagged pairlist. Attributes are stored only if $object_info indicates their presence, while vector elements are parsed using $element_parser. Finally, the parsed attributes and elements are used as arguments to the constructor of the $rexp_class, which should be a subclass of Statistics::R::REXP::Vector.
- header
-
Parser for header of R serialization: the serialization format (XDR, binary, etc.), the version number of the serialization (currently 2), and two 32-bit integers indicating the version of R which wrote the file followed by the minimal version of R needed to read the format.
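The header layout described above can be sketched for the XDR case as follows. The sketch assumes the "X\n" marker for XDR files and R's usual packing of a version number as major * 65536 + minor * 256 + patch. The names parse_xdr_header and decode_r_version are hypothetical, not the module's own parsers.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's "header" parser: reads an XDR
# header from the start of $data and returns its fields, or nothing if
# the data does not start with a complete XDR header.
sub parse_xdr_header {
    my ($data) = @_;
    # "X\n" marks XDR; three big-endian 32-bit integers follow.
    return unless length($data) >= 14 && substr($data, 0, 2) eq "X\n";
    my ($format_version, $writer, $min_reader) =
        unpack 'N3', substr($data, 2, 12);
    return {
        format_version => $format_version,
        writer_version => decode_r_version($writer),
        min_reader     => decode_r_version($min_reader),
    };
}

# Assumption: R packs a version as major * 65536 + minor * 256 + patch.
sub decode_r_version {
    my ($v) = @_;
    return join '.', $v >> 16, ($v >> 8) & 0xFF, $v & 0xFF;
}

# Sample header built by hand: format 2, writer R 3.0.2, reader R 2.3.0.
my $sample = "X\n" . pack('N3', 2, 3 * 65536 + 2, 2 * 65536 + 3 * 256);
my $header = parse_xdr_header($sample);
print "format $header->{format_version}, ",
      "written by R $header->{writer_version}\n";
```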
- xdr, bin
-
Parsers for RDS header indicating files in XDR or native-binary format.
- maybe_long_length
-
Parser for vector length, allowing for the encoding of 64-bit long vectors introduced in R 3.0.
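A minimal sketch of the long-vector length encoding, assuming big-endian (XDR) data and the convention in R's serialization code that a length of -1 is followed by the upper and lower 32-bit halves of the real length. The name read_length is hypothetical and works on a plain string and offset rather than on a Statistics::R::IO::ParserState.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's parser: reads a vector length
# from $data at byte offset $pos and returns the length together with
# the offset of the first byte after it.
sub read_length {
    my ($data, $pos) = @_;
    my $len = unpack 'l>', substr($data, $pos, 4);   # signed, big-endian
    $pos += 4;
    return ($len, $pos) unless $len == -1;

    # -1 signals a long vector: upper and lower 32-bit halves follow.
    my ($upper, $lower) = unpack 'N2', substr($data, $pos, 8);
    # Multiplication instead of a shift also works on 32-bit perls.
    return ($upper * 2**32 + $lower, $pos + 8);
}

my ($short) = read_length(pack('l>', 5), 0);
my ($long)  = read_length(pack('l> N2', -1, 1, 0), 0);
print "short=$short long=$long\n";
```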
- tagged_pairlist_to_rexp_hash
-
Converts a pairlist to a REXP hash whose keys are the pairlist's element tags and values the pairlist elements themselves.
- tagged_pairlist_to_attribute_hash
-
Converts object attributes, which are serialized as a pairlist with attribute name in the element's tag, to a hash that can be used as the attributes argument to Statistics::R::REXP constructors. Some attributes are serialized using a compact encoding (for instance, when a table's row names are just integers 1:nrows), and this function will decode them to a complete REXP.
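To make the compact encoding concrete, here is a sketch of expanding compact row names. It assumes R's convention of storing row names 1:n as the two-element integer vector (NA, -n) or (NA, n), with NA represented by the smallest 32-bit signed integer. The name expand_row_names is hypothetical and the function operates on plain Perl integers rather than REXP objects.

```perl
use strict;
use warnings;

# Assumption: R's NA_integer_ is the smallest 32-bit signed integer.
use constant NA_INTEGER => -2147483648;

# Hypothetical helper, not the module's code: takes the elements of an
# integer row.names vector and expands the compact forms (NA, n) and
# (NA, -n) into the full sequence 1..n.
sub expand_row_names {
    my @values = @_;
    if (@values == 2 && $values[0] == NA_INTEGER) {
        return (1 .. abs($values[1]));
    }
    return @values;   # already a complete list of row names
}

my @rows = expand_row_names(NA_INTEGER, -4);
print "@rows\n";
```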
BUGS AND LIMITATIONS
There are no known bugs in this module. Please see Statistics::R::IO for bug reporting.
SUPPORT
See Statistics::R::IO for support and contact information.
AUTHOR
Davor Cubranic <cubranic@stat.ubc.ca>
COPYRIGHT AND LICENSE
This software is Copyright (c) 2014 by University of British Columbia.
This is free software, licensed under:
The GNU General Public License, Version 3, June 2007