NAME
PDL::IO::Misc - misc IO routines for PDL
DESCRIPTION
Some basic I/O functionality: FITS, tables, byte-swapping
SYNOPSIS
use PDL::IO::Misc;
FUNCTIONS
bswap2
Signature: (x(); )
Swaps pairs of bytes in argument x()
bswap2 does not process bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.
bswap4
Signature: (x(); )
Swaps quads of bytes in argument x()
bswap4 does not process bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.
bswap8
Signature: (x(); )
Swaps octets of bytes in argument x()
bswap8 does not process bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.
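To make the three swap widths concrete, here is a minimal sketch in core Perl, without PDL. The helper name swap_groups is hypothetical and is not part of this module. It returns a new string, whereas the routines above are documented as swapping bytes in their argument.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL): reverse the byte order within
# each consecutive group of $width bytes of a packed string.
sub swap_groups {
    my ($bytes, $width) = @_;
    die "length must be a multiple of $width\n" if length($bytes) % $width;
    return join '', map { scalar reverse $_ } unpack "(a$width)*", $bytes;
}

my $raw = pack 'C*', 1 .. 8;    # bytes 1 2 3 4 5 6 7 8

print join(' ', unpack 'C*', swap_groups($raw, 2)), "\n";   # pairs:  2 1 4 3 6 5 8 7
print join(' ', unpack 'C*', swap_groups($raw, 4)), "\n";   # quads:  4 3 2 1 8 7 6 5
print join(' ', unpack 'C*', swap_groups($raw, 8)), "\n";   # octets: 8 7 6 5 4 3 2 1
```

Widths 2, 4 and 8 correspond to the groupings described for bswap2, bswap4 and bswap8 respectively.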
rcols
Read specified ASCII cols from a file into piddles and perl arrays (also see "rgrep").
Usage:
($x,$y,...) = rcols( *HANDLE|"filename", { EXCLUDE => '/^!/' }, $col1, $col2, ... )
$x = rcols( *HANDLE|"filename", { EXCLUDE => '/^!/' }, [] )
($x,$y,...) = rcols( *HANDLE|"filename", $col1, $col2, ..., { EXCLUDE => '/^!/' } )
($x,$y,...) = rcols( *HANDLE|"filename", "/foo/", $col1, $col2, ... )
For each column number specified, a 1D output PDL will be generated. Anonymous arrays of column numbers generate 2D output piddles with dim0 for the column data and dim1 equal to the number of columns in the anonymous array(s).
An empty anonymous array as column specification will produce a single output data piddle with dim(1) equal to the number of columns available.
There are two calling conventions: the old version, where a pattern can be specified after the filename/handle, and the new version, where options are given as a hash reference. This reference can be given as either the second or last argument.
The default behaviour is to ignore lines beginning with a # character and lines that only consist of whitespace. Options exist to only read from lines that match, or do not match, supplied patterns, and to set the types of the created piddles.
rcols accepts a file name or *HANDLE. If no explicit column numbers are specified, all columns are assumed. For the allowed types, see "Datatype_conversions" in PDL::Core.
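The default line filtering and column splitting described above can be sketched in core Perl. This is only an illustration under stated assumptions, not the PDL implementation: the helper read_columns is hypothetical, it returns plain Perl array references rather than piddles, and it always splits on whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not the PDL implementation): collect whitespace-
# separated fields into one Perl array per column, skipping lines that
# begin with '#' and lines that contain only whitespace.
sub read_columns {
    my ($fh) = @_;
    my @cols;
    while (my $line = <$fh>) {
        next if $line =~ /^#/;       # default exclusion pattern
        next if $line =~ /^\s*$/;    # blank or whitespace-only line
        my @fields = split ' ', $line;
        push @{ $cols[$_] }, $fields[$_] for 0 .. $#fields;
    }
    return @cols;
}

my $text = "# x y\n1 10\n\n2 20\n3 30\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
my ($x, $y) = read_columns($fh);
print "x = @$x\n";   # x = 1 2 3
print "y = @$y\n";   # y = 10 20 30
```

Each returned array holds one column, which mirrors the convention that dim(0) runs along the data of a single column.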
Options (case insensitive):
EXCLUDE or IGNORE
- ignore lines matching this pattern (default '/^#/').
INCLUDE or KEEP
- only use lines which also match this pattern (default '').
LINES
- a string pattern specifying which line numbers to use. Line numbers start at 0 and the syntax is 'a:b:c' to use every c'th matching line between a and b (default '').
DEFTYPE
- default data type for stored data (if not specified, use the type stored in $PDL::IO::Misc::deftype, which starts off as double).
TYPES
- reference to an array of data types, one element for each column
COLSEP
- splits on this string/pattern/qr{} between columns of data. Defaults to $PDL::IO::Misc::defcolsep.
PERLCOLS
- an array of column numbers which are to be read into perl arrays rather than piddles. Any PERLCOLS columns not named in the explicit list of columns to read will be returned after the explicit columns (default undef).
COLIDS
- if defined to an array reference, it will be assigned the column ID values obtained by splitting the first line of the file in the identical fashion to the column data.
CHUNKSIZE
- the number of input data elements to batch together before appending to each output data piddle (default 100). If CHUNKSIZE is greater than the number of lines of data to read, the entire file is slurped in, lines split, and perl lists of column data are generated. At the end, effectively pdl(@column_data) produces any result piddles.
VERBOSE
- be verbose about IO processing (default $PDL::verbose)
For example:
$x = PDL->rcols('file1');      # file1 has only one column of data
$x = PDL->rcols('file2', []);  # file2 can have multiple columns, still 1 piddle output
                               # (empty array ref spec means all possible data fields)
($x,$y) = rcols 'table.csv', { COLSEP => ',' };  # read CSV data file
($x,$y) = rcols *STDIN;        # default separator for lines like '32 24'
# read in lines containing the string foo, where the first
# example also ignores lines that begin with a # character.
($x,$y,$z) = rcols 'file2', 0,4,5, { INCLUDE => '/foo/' };
($x,$y,$z) = rcols 'file2', 0,4,5, { INCLUDE => '/foo/', EXCLUDE => '' };
# ignore the first 27 lines of the file, reading in as ushort's
($x,$y) = rcols 'file3', { LINES => '27:-1', DEFTYPE => ushort };
($x,$y) = rcols 'file3', { LINES => '27:', TYPES => [ ushort, ushort ] };
# read in the first column as a perl array and the next two as piddles
# with the perl column returned after the piddle outputs
($x,$y,$name) = rcols 'file4', 1, 2, { PERLCOLS => [ 0 ] };
printf "Number of names read in = %d\n", 1 + $#$name;
# read in the first column as a perl array and the next two as piddles
# with PERLCOLS changing the type of the first returned value to perl list ref
($name,$x,$y) = rcols 'file4', 0, 1, 2, { PERLCOLS => [ 0 ] };
# read in the first column as a perl array returned first, followed by
# the next two data columns in the file as a single Nx2 piddle
($name,$xy) = rcols 'file4', 0, [1, 2], { PERLCOLS => [ 0 ] };
NOTES:
1. Quotes are required on patterns, or use the qr{} quote regexp syntax.
2. Columns are separated by whitespace by default; use the COLSEP option to specify an alternate split pattern or string, or specify an alternate default separator by setting $PDL::IO::Misc::defcolsep.
3. Legacy support is present to use $PDL::IO::Misc::colsep to set the column separator, but $PDL::IO::Misc::colsep is not defined by default. If you set the variable to a defined value it will get picked up.
4. LINES => '-1:0:3' may not work as you expect, since lines are skipped when read in, then the whole array reversed.
5. For consistency with wcols and rcols 1D usage, column data is loaded into the rows of the pdls (i.e., dim(0) is the elements read per column in the file and dim(1) is the number of columns of data read).
wcols
Write ASCII columns into file from 1D or 2D piddles and/or 1D listrefs efficiently.
Can take file name or *HANDLE, and if no file/filehandle is given defaults to STDOUT.
Options (case insensitive):
HEADER - prints this string before the data. If the string is not terminated by a newline, one is added (default '').
COLSEP - prints this string between columns of data. Defaults to $PDL::IO::Misc::defcolsep.
FORMAT - A printf-style format string that is cycled through column output for user controlled formatting.
Usage: wcols $data1, $data2, $data3,..., *HANDLE|"outfile", [\%options];  # or
wcols $format_string, $data1, $data2, $data3,..., *HANDLE|"outfile", [\%options];
where the $dataN args are either 1D piddles, 1D perl array refs, or 2D piddles (as might be returned from rcols() with the [] column syntax and/or using the PERLCOLS option). dim(0) of all piddles written must be the same size. The printf-style $format_string, if given, overrides any FORMAT key settings in the option hash.
e.g.,
$x = random(4);
$y = ones(4);
wcols $x, $y+2, 'foo.dat';
wcols $x, $y+2, *STDERR;
wcols $x, $y+2, '|wc';
$a = sequence(3);
$b = zeros(3);
$c = random(3);
wcols $a,$b,$c;                 # Orthogonal version of 'print $a,$b,$c' :-)
wcols "%10.3f", $a,$b;          # Formatted
wcols "%10.3f %10.5g", $a,$b;   # Individual column formatting
$a = sequence(3);
$b = zeros(3);
$units = [ 'm/sec', 'kg', 'MPH' ];
wcols $a,$b, { HEADER => "# a b" };
wcols $a,$b, { Header => "# a b", Colsep => ', ' };  # case insensitive option names!
wcols " %4.1f %4.1f %s", $a,$b,$units, { header => "# Day Time Units" };
$a52 = sequence(5,2);
$b = ones(5);
$c = [ 1, 2, 4 ];
wcols $a52;           # now can write out 2D pdls (2 columns data in output)
wcols $b, $a52, $c;   # ...and mix and match with 1D listrefs as well
NOTES:
1. Columns are separated by whitespace by default; use $PDL::IO::Misc::defcolsep to modify the default value, or use the COLSEP option.
2. Support for the $PDL::IO::Misc::colsep global value of PDL-2.4.6 and earlier is maintained, but the initial value of the global is undef until you set it. The value will then be picked up and used as if defcolsep were specified.
3. Dim 0 corresponds to the column data dimension for both rcols and wcols. This makes wcols the reverse operation of rcols.
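The HEADER and COLSEP behaviour described above can be sketched in core Perl. This is an illustration only, not the PDL implementation: write_columns is a hypothetical helper, its argument order differs from wcols, it takes Perl array references rather than piddles, and the single-space default separator is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not the PDL implementation): write equal-length
# Perl arrays as text columns, one output row per element index.
sub write_columns {
    my ($fh, $opt, @cols) = @_;
    my $sep = defined $opt->{COLSEP} ? $opt->{COLSEP} : ' ';   # assumed default
    if (defined $opt->{HEADER} and length $opt->{HEADER}) {
        my $header = $opt->{HEADER};
        $header .= "\n" unless $header =~ /\n\z/;   # add the missing newline
        print {$fh} $header;
    }
    for my $i (0 .. $#{ $cols[0] }) {
        print {$fh} join($sep, map { $_->[$i] } @cols), "\n";
    }
    return;
}

my $out = '';
open my $fh, '>', \$out or die "cannot open in-memory file: $!";
write_columns($fh, { HEADER => '# a b', COLSEP => ', ' }, [0, 1, 2], [5, 6, 7]);
close $fh;
print $out;
```

Each input array becomes one output column, which is the reverse of collecting one array per column when reading.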
swcols
Generate a string list from a sprintf format specifier and a list of piddles.
swcols takes an (optional) format specifier of the printf sort and a list of 1D piddles as input. It returns a perl array (or array reference if called in scalar context) where each element of the array is the string generated by printing the corresponding element of the piddle(s) using the format specified. If no format is specified it uses the default print format.
Usage:
@str = swcols format, pdl1,pdl2,pdl3,...;
or
$str = swcols format, pdl1,pdl2,pdl3,...;
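The element-by-element formatting and the list versus scalar context behaviour can be sketched in core Perl. The helper format_rows is hypothetical, works on Perl array references instead of piddles, and always requires a format string, so treat it as an illustration rather than the actual implementation.

```perl
use strict;
use warnings;

# Hypothetical helper (not the PDL implementation): apply one sprintf
# call per row, taking the i-th element of every input list.
sub format_rows {
    my ($format, @lists) = @_;
    my $n = @{ $lists[0] };
    my @rows = map {
        my $i = $_;
        sprintf $format, map { $_->[$i] } @lists;
    } 0 .. $n - 1;
    return wantarray ? @rows : \@rows;   # list, or array ref in scalar context
}

my @str = format_rows('%d:%.1f', [1, 2, 3], [0.5, 1.5, 2.5]);
print "$_\n" for @str;                   # 1:0.5  2:1.5  3:2.5 (one per line)

my $ref = format_rows('%d', [7, 8]);     # scalar context gives an array ref
print scalar(@$ref), " strings\n";       # 2 strings
```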
rgrep
Read columns into piddles using full regexp pattern matching.
Options:
UNDEFINED: This option determines what will be done for undefined values, for instance when reading a comma-separated file of the type 1,2,,4 where the ,, indicates a missing value. The default is to assign $PDL::undefval to undefined values, but if UNDEFINED is set this is used instead. This would normally be set to a number, but if it is set to Bad and PDL is compiled with Badvalue support (see PDL::Bad) then undefined values are set to the appropriate badvalue and the column is marked as bad.
DEFTYPE: Sets the default type of the columns - see the documentation for rcols.
TYPES: A reference to a Perl array with types for each column - see the documentation for rcols.
BUFFERSIZE: The number of lines to extend the piddle by. Setting this to the number of lines in the file might speed up the reading a little, but in general rasc() is a better choice.
Usage:
($x,$y,...) = rgrep(sub, *HANDLE|"filename")
e.g.
($a,$b) = rgrep {/Foo (.*) Bar (.*) Mumble/} $file;
i.e. the vectors $a and $b get the progressive values of $1, $2 etc.
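How capture groups turn into columns can be sketched in core Perl. The helper grep_columns is hypothetical and differs from rgrep in two ways that are assumptions of this sketch: the line is passed to the sub as an argument rather than through $_, and the results are Perl array references rather than piddles.

```perl
use strict;
use warnings;

# Hypothetical helper (not the PDL implementation): run a matching sub on
# each line and append its captures, one per column, to Perl arrays.
sub grep_columns {
    my ($matcher, $fh) = @_;
    my @cols;
    while (my $line = <$fh>) {
        my @captures = $matcher->($line) or next;   # skip non-matching lines
        push @{ $cols[$_] }, $captures[$_] for 0 .. $#captures;
    }
    return @cols;
}

my $text = "Foo 1 Bar 2 Mumble\nnoise\nFoo 3 Bar 4 Mumble\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
my ($first, $second) =
    grep_columns(sub { $_[0] =~ /Foo (.*) Bar (.*) Mumble/ }, $fh);
print "@$first | @$second\n";   # 1 3 | 2 4
```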
rdsa
Read a FIGARO/NDF format file.
Requires non-PDL DSA module. Contact Frossie (frossie@jach.hawaii.edu)
Usage:
([$xaxis],$data) = rdsa($file)
$a = rdsa 'file.sdf'
Not yet tested with PDL-1.9X versions
isbigendian
Determine endianness of machine - returns 1 if the machine is big-endian and 0 otherwise
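One common way to make this check in core Perl is to compare the native and big-endian packings of the same integer. The helper native_is_big_endian below is hypothetical; the sketch makes no claim about how isbigendian itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper (not the PDL implementation): pack the 32-bit value 1
# in native byte order and compare it with the big-endian ("network") layout.
sub native_is_big_endian {
    return pack('L', 1) eq pack('N', 1) ? 1 : 0;
}

printf "big-endian: %d\n", native_is_big_endian();
```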
rasc
Simple function to slurp in ASCII numbers quite quickly, although error handling is marginal (to nonexistent).
$pdl->rasc("filename"|FILEHANDLE [,$noElements]);
Where:
filename is the name of the ASCII file to read or open file handle
$noElements is the optional number of elements in the file to read.
(If not present, all of the file will be read to fill up $pdl).
$pdl can be of type float or double (for more precision).
# (test.num is an ascii file with 20 numbers. One number per line.)
$in = PDL->null;
$num = 20;
$in->rasc('test.num',$num);
$imm = zeroes(float,20,2);
$imm->rasc('test.num');
rcube
Read list of files directly into a large data cube (for efficiency)
$cube = rcube \&reader_function, @files;
$cube = rcube \&rfits, glob("*.fits");
This IO function allows direct reading of files into a large data cube. One could use cat() instead, but this is more memory efficient.
The reading function (e.g. rfits, readfraw) (passed as a reference) and files are the arguments.
The cube is created as the same X,Y dims and datatype as the first image specified. The Z dim is simply the number of images.
AUTHOR
Copyright (C) Karl Glazebrook 1997, Craig DeForest 2001, 2003, and Chris Marshall 2010. All rights reserved. There is no warranty. You are allowed to redistribute this software / documentation under certain conditions. For details, see the file COPYING in the PDL distribution. If this file is separated from the PDL distribution, the copyright notice should be included in the file.