NAME
Data::Frame - data frame implementation
VERSION
version 0.003
SYNOPSIS
use Data::Frame;
use PDL;
my $df = Data::Frame->new( columns => [
z => pdl(1, 2, 3, 4),
y => ( sequence(4) >= 2 ) ,
x => [ qw/foo bar baz quux/ ],
] );
say $df;
# ---------------
# z y x
# ---------------
# 0 1 0 foo
# 1 2 0 bar
# 2 3 1 baz
# 3 4 1 quux
# ---------------
say $df->nth_column(0);
# [1 2 3 4]
say $df->select_rows( 3,1 )
# ---------------
# z y x
# ---------------
# 3 4 1 quux
# 1 2 0 bar
# ---------------
DESCRIPTION
This implements a data frame container that uses PDL for individual columns. As such, it supports marking missing values (BAD
values).
The API is currently experimental and is made to work with Statistics::NiceR, so be aware that it could change.
METHODS
new
new( Hash %options ) # returns Data::Frame
Creates a new Data::Frame
when passed the following options as a specification of the columns to add:
columns => ArrayRef $columns_array
When
columns
is passed anArrayRef
of pairs of the form$columns_array = [ column_name_z => $column_01_data, # first column data column_name_y => $column_02_data, # second column data column_name_x => $column_03_data, # third column data ]
then the column data is added to the data frame in the order that the pairs appear in the
ArrayRef
.columns => HashRef $columns_hash
$columns_hash = { column_name_z => $column_03_data, # third column data column_name_y => $column_02_data, # second column data column_name_x => $column_01_data, # first column data }
then the column data is added to the data frame by the order of the keys in the
HashRef
(sorted with a stringwisecmp
).
string
string() # returns Str
Returns a string representation of the Data::Frame
.
number_of_columns
number_of_columns() # returns Int
Returns the count of the number of columns in the Data::Frame
.
number_of_rows
number_of_rows() # returns Int
Returns the count of the number of rows in the Data::Frame
.
nth_columm
number_of_rows(Int $n) # returns a column
Returns column number $n
. Supports negative indices (e.g., $n = -1 returns the last column).
column_names
column_names() # returns an ArrayRef
column_names( @new_column_names ) # returns an ArrayRef
Returns an ArrayRef
of the names of the columns.
If passed a list of arguments @new_column_names
, then the columns will be renamed to the elements of @new_column_names
. The length of the argument must match the number of columns in the Data::Frame
.
row_names
row_names() # returns a PDL
row_names( Array @new_row_names ) # returns a PDL
row_names( ArrayRef $new_row_names ) # returns a PDL
row_names( PDL $new_row_names ) # returns a PDL
Returns an ArrayRef
of the names of the columns.
If passed a argument, then the rows will be renamed. The length of the argument must match the number of rows in the Data::Frame
.
column
column( Str $column_name )
Returns the column with the name $column_name
.
add_columns
add_columns( Array @column_pairlist )
Adds all the columns in @column_pairlist
to the Data::Frame
.
add_column
add_column(Str $name, $data)
Adds a single column to the Data::Frame
with the name $name
and data $data
.
select_rows
select_rows( Array @which )
select_rows( ArrayRef $which )
select_rows( PDL $which )
The argument $which
is a vector of indices. select_rows
returns a new Data::Frame
that contains rows that match the indices in the vector $which
.
This Data::Frame
supports PDL's data flow, meaning that changes to the values in the child data frame columns will appear in the parent data frame.
If no indices are given, a Data::Frame
with no rows is returned.
SEE ALSO
AUTHOR
Zakariyya Mughal <zmughal@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2014 by Zakariyya Mughal.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.