NAME
Microarray::GEO::SOFT::GDS - GEO data set data class
SYNOPSIS
use Microarray::GEO::SOFT;
my $soft = Microarray::GEO::SOFT->new;
$soft->download("GDS3719");
my $gds = $soft->parse;
# the meta information
$gds->meta;
$gds->platform;
$gds->title;
$gds->accession;
# the sample data is a matrix
$gds->matrix;
# the names for each column
$gds->colnames;
# the names for each row (the primary ID for rows)
$gds->rownames;
DESCRIPTION
A DataSet represents a curated collection of biologically and statistically comparable GEO Samples and forms the basis of GEO's suite of data display and analysis tools. Samples within a DataSet refer to the same Platform, that is, they share a common set of array elements. Value measurements for each Sample within a DataSet are assumed to be calculated in an equivalent manner, that is, considerations such as background processing and normalization are consistent across the DataSet. Information reflecting experimental factors is provided through DataSet subsets. (Copied from the GEO web site.)
This module retrieves data stored in the GEO DataSet (GDS) format. We take this as the basic microarray data format (an expression matrix).
Subroutines
new("file" => $file, "use_identifier" => 0, "verbose" => 1)
-
Initialize a GDS class object. The 'file' argument is the path to the microarray data in SOFT format, or a file handle that has already been opened. This argument is optional, since the data set can also be downloaded through Microarray::GEO::SOFT. Because gene identifiers are integrated into the SOFT file, the user can choose whether to take probe IDs or identifiers as the primary ID. We do not recommend setting 'use_identifier' to TRUE, because 'id_convert' will not work in that case. 'verbose' determines whether messages are printed during analysis. 'sample_value_column' is the column name for table data when parsing GSM data.
$gds->parse
-
Retrieve data set information from microarray data. The data set in SOFT format is always a table.
$gds->meta
-
Get meta information
$gds->set_meta(HASH)
-
Set meta information. Valid arguments are 'accession', 'title' and 'platform'.
$gds->table
-
Get table information
$gds->set_table
-
Set table information. Valid arguments are 'rownames', 'colnames', 'colname_explain' and 'matrix'.
$gds->platform
-
Accession number for the platform the data set belongs to.
$gds->title
-
Title of the data set record
$gds->accession
-
Accession number for the data set
$gds->rownames
-
Primary ID for probes
$gds->colnames
-
Different sample names or experiment designs
$gds->colnames_explain
-
A slightly more detailed explanation of the column names
$gds->matrix
-
Expression value matrix
$gds->id_convert($gpl, $to_id)
-
Transform the primary ID to a new ID type. The first argument is the Microarray::GEO::SOFT::GPL class object for the platform that the GDS belongs to. The second argument is the ID type to map to. It must be one of the colnames of $gpl; a regexp is also accepted. It returns a Microarray::ExprSet object.
$gds->soft2exprset
-
Transform Microarray::GEO::SOFT::GDS class object to Microarray::ExprSet class object.
$gds->get_subset(HASH)
-
Get a subset of rows and columns of the expression matrix. Valid arguments are 'byrow' and 'bycol'. The value of each argument should be an array reference whose length equals the number of rownames or colnames of the matrix, respectively. Each element of the array should be either TRUE (1) or FALSE (0), indicating whether to keep or drop the corresponding position in the matrix. It returns a Microarray::ExprSet object.
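The mask semantics described for 'byrow' and 'bycol' can be illustrated with a small stand-alone sketch. It does not use or reproduce the module's internals: subset_by_mask is a hypothetical helper written only for this example, and the matrix values are made up. It assumes the matrix is an array reference of row array references.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Microarray::GEO::SOFT::GDS): keep the
# rows and columns of a matrix whose mask entries are true.
sub subset_by_mask {
    my ($matrix, $byrow, $bycol) = @_;

    # Indices of the rows and columns flagged with a true value.
    my @keep_rows = grep { $byrow->[$_] } 0 .. $#{$byrow};
    my @keep_cols = grep { $bycol->[$_] } 0 .. $#{$bycol};

    # Build a new matrix from the kept rows, slicing out the kept columns.
    return [ map { [ @{ $matrix->[$_] }[@keep_cols] ] } @keep_rows ];
}

# Toy 3 x 4 expression matrix (made-up values).
my $matrix = [
    [ 1.5,  2.5,  3.5,  4.5 ],
    [ 5.5,  6.5,  7.5,  8.5 ],
    [ 9.5, 10.5, 11.5, 12.5 ],
];

my @byrow = (1, 0, 1);       # one flag per row: keep the first and third rows
my @bycol = (0, 1, 1, 0);    # one flag per column: keep the two middle columns

my $subset = subset_by_mask($matrix, \@byrow, \@bycol);
print join("\t", @$_), "\n" for @$subset;
```

Going by the argument names documented above, the corresponding call on a parsed data set would presumably be $gds->get_subset('byrow' => \@byrow, 'bycol' => \@bycol), with each mask as long as rownames or colnames respectively.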
AUTHOR
Zuguang Gu <jokergoo@gmail.com>
COPYRIGHT AND LICENSE
Copyright 2012 by Zuguang Gu
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.12.1 or, at your option, any later version of Perl 5 you may have available.