LICENSE
Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
CONTACT
Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.
Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.
NAME
Bio::EnsEMBL::Utils::IO
SYNOPSIS
use Bio::EnsEMBL::Utils::IO qw/slurp work_with_file slurp_to_array fh_to_array/;
#or
# use Bio::EnsEMBL::Utils::IO qw/:slurp/; # brings in any method starting with slurp
# use Bio::EnsEMBL::Utils::IO qw/:array/; # brings in any method which ends with _array
# use Bio::EnsEMBL::Utils::IO qw/:gz/; # brings all methods which start with gz_
# use Bio::EnsEMBL::Utils::IO qw/:bz/; # brings all methods which start with bz_
# use Bio::EnsEMBL::Utils::IO qw/:zip/; # brings all methods which start with zip_
# use Bio::EnsEMBL::Utils::IO qw/:all/; # brings all methods in
# As a scalar
my $file_contents = slurp('/my/file/location.txt');
print length($file_contents);
# As a ref
my $file_contents_ref = slurp('/my/file/location.txt', 1);
print length($$file_contents_ref);
# Sending it to an array
my $array = slurp_to_array('/my/location');
work_with_file('/my/location', 'r', sub {
$array = process_to_array($_[0], sub {
#Gives us input line by line
return "INPUT: $_[0]";
});
});
# Simplified version but without the post processing
$array = fh_to_array($fh);
# Sending this back out to another file
work_with_file('/my/file/newlocation.txt', 'w', sub {
my ($fh) = @_;
print $fh $$file_contents_ref;
return;
});
# Gzipping the data to another file
gz_work_with_file('/my/file.gz', 'w', sub {
my ($fh) = @_;
print $fh $$file_contents_ref;
return;
});
# Working with a set of lines manually
work_with_file('/my/file', 'r', sub {
my ($fh) = @_;
iterate_lines($fh, sub {
my ($line) = @_;
print $line; #Send the line in the file back out
return;
});
return;
});
# Doing the same in one go
iterate_file('/my/file', sub {
my ($line) = @_;
print $line; #Send the line in the file back out
return;
});
# Move all data from one file handle to another. A bit like a copy
move_data($src_fh, $trg_fh);
DESCRIPTION
A collection of subroutines to help with IO-based operations.
METHODS
See subroutines.
MAINTAINER
$Author$
VERSION
$Revision$
slurp()
Arg [1] : string $file
Arg [2] : boolean; $want_ref
Indicates if we want to return a scalar reference
Arg [3] : boolean; $binary
Description : Forces the contents of a file into a scalar. This is the
fastest way to get a file into memory in Perl. You can also
get a scalar reference back to avoid copying the file
contents. If the input file is binary then specify this
with the binary flag
Returntype : Scalar or reference of the file contents depending on arg 2
Example : my $contents = slurp('/tmp/file.txt');
Exceptions : If the file did not exist or was not readable
Status : Stable
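The slurp technique described above can be sketched in core Perl as follows. This is an illustration only, not the module's actual code: sketch_slurp is a hypothetical name, and the handling of $binary via binmode is an assumption about what the flag does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::EnsEMBL::Utils::IO):
# read a whole file into a scalar, optionally returning a reference.
sub sketch_slurp {
  my ($file, $want_ref, $binary) = @_;
  open(my $fh, '<', $file) or die "Cannot open '$file' for reading: $!";
  binmode($fh) if $binary;      # assumption: the binary flag maps to binmode
  local $/ = undef;             # no record separator, so one read returns everything
  my $contents = <$fh>;
  close($fh) or die "Cannot close '$file': $!";
  $contents = q{} unless defined $contents;
  # Returning a reference avoids copying a large string back to the caller
  return $want_ref ? \$contents : $contents;
}
```

Called as sketch_slurp($file, 1), the helper returns a reference that is dereferenced with $$ref, mirroring the second SYNOPSIS example.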
spurt()
Arg [1] : string $file
Arg [2] : string $contents
Arg [3] : boolean; $append
Arg [4] : boolean; $binary
Description : Convenient method to safely open a file and dump some content into it.
$append can be set to append to the file instead of resetting it first.
$binary can be set if the content you are printing is not plain-text.
Returntype : None
Example : spurt('/tmp/file.txt', $contents);
Exceptions : If the file could not be created or was not writable
Status : Stable
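As a rough illustration of the behaviour described for spurt(), the sketch below writes or appends a string using only core Perl. The name sketch_spurt is hypothetical, and mapping $append to the '>>' open mode and $binary to binmode are assumptions rather than the module's confirmed implementation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::EnsEMBL::Utils::IO):
# write $contents to $file, truncating first unless $append is true.
sub sketch_spurt {
  my ($file, $contents, $append, $binary) = @_;
  my $mode = $append ? '>>' : '>';
  open(my $fh, $mode, $file) or die "Cannot open '$file' for writing: $!";
  binmode($fh) if $binary;      # assumption: the binary flag maps to binmode
  print {$fh} $contents or die "Cannot write to '$file': $!";
  close($fh) or die "Cannot close '$file': $!";
  return;
}
```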
gz_slurp
Arg [1] : string $file
Arg [2] : boolean; $want_ref Indicates if we want to return a scalar reference
Arg [3] : boolean; $binary
Arg [4] : HashRef arguments to pass into IO compression layers
Description : Forces the contents of a gzipped file into a scalar. This is the
fastest way to get a file into memory in Perl. You can also
get a scalar reference back to avoid copying the file
contents. If the input file is binary then specify this
with the binary flag
Returntype : Scalar or reference of the file contents depending on arg 2
Example : my $contents = gz_slurp('/tmp/file.txt.gz');
Exceptions : If the file did not exist or was not readable
Status : Stable
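One way to picture a compressed slurp is with the core IO::Uncompress::Gunzip module, as in the sketch below. It is not the module's implementation: sketch_gz_slurp is a hypothetical name, and forwarding the HashRef as options to gunzip() is an assumption about what "IO compression layers" means here.

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper (not part of Bio::EnsEMBL::Utils::IO):
# decompress a whole gzip file into a scalar in one call.
sub sketch_gz_slurp {
  my ($file, $want_ref, $args) = @_;
  my $contents;
  # gunzip() accepts a filename as input and a scalar reference as output;
  # any options in $args are passed straight through (assumed behaviour)
  gunzip($file => \$contents, %{ $args || {} })
    or die "gunzip of '$file' failed: $GunzipError";
  return $want_ref ? \$contents : $contents;
}
```

The bzip2 and zip variants could be sketched the same way with IO::Uncompress::Bunzip2 and IO::Uncompress::Unzip.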
bz_slurp
Arg [1] : string $file
Arg [2] : boolean; $want_ref Indicates if we want to return a scalar reference
Arg [3] : boolean; $binary
Arg [4] : HashRef arguments to pass into IO compression layers
Description : Forces the contents of a bzipped file into a scalar. This is the
fastest way to get a file into memory in Perl. You can also
get a scalar reference back to avoid copying the file
contents. If the input file is binary then specify this
with the binary flag
Returntype : Scalar or reference of the file contents depending on arg 2
Example : my $contents = bz_slurp('/tmp/file.txt.bz2');
Exceptions : If the file did not exist or was not readable
Status : Stable
zip_slurp
Arg [1] : string $file
Arg [2] : boolean; $want_ref Indicates if we want to return a scalar reference
Arg [3] : boolean; $binary
Arg [4] : HashRef arguments to pass into IO compression layers
Description : Forces the contents of a zipped file into a scalar. This is the
fastest way to get a file into memory in Perl. You can also
get a scalar reference back to avoid copying the file
contents. If the input file is binary then specify this
with the binary flag
Returntype : Scalar or reference of the file contents depending on arg 2
Example : my $contents = zip_slurp('/tmp/file.txt.zip');
Exceptions : If the file did not exist or was not readable
Status : Stable
slurp_to_array
Arg [1] : string $file
Arg [2] : boolean $chomp
Description : Sends the contents of the given file into an ArrayRef
Returntype : ArrayRef
Example : my $contents_array = slurp_to_array('/tmp/file.txt');
Exceptions : If the file did not exist or was not readable
Status : Stable
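The effect of the $chomp flag is easiest to see in a small core-Perl sketch. The helper name sketch_slurp_to_array is hypothetical and the code is an approximation of the documented behaviour, not the module's source.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::EnsEMBL::Utils::IO):
# read every line of a file into an ArrayRef, optionally chomping each line.
sub sketch_slurp_to_array {
  my ($file, $chomp) = @_;
  open(my $fh, '<', $file) or die "Cannot open '$file' for reading: $!";
  my @lines = <$fh>;            # list context reads all remaining lines
  close($fh) or die "Cannot close '$file': $!";
  chomp(@lines) if $chomp;      # strips the trailing newline from every element
  return \@lines;
}
```

Without $chomp each element keeps its trailing newline; with it, a file containing "a\nb\n" yields the elements 'a' and 'b'.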
gz_slurp_to_array
Arg [1] : string $file
Arg [2] : boolean $chomp
Arg [3] : HashRef arguments to pass into IO compression layers
Description : Sends the contents of the given gzipped file into an ArrayRef
Returntype : ArrayRef
Example : my $contents_array = gz_slurp_to_array('/tmp/file.txt.gz');
Exceptions : If the file did not exist or was not readable
Status : Stable
bz_slurp_to_array
Arg [1] : string $file
Arg [2] : boolean $chomp
Arg [3] : HashRef arguments to pass into IO compression layers
Description : Sends the contents of the given bzipped file into an ArrayRef
Returntype : ArrayRef
Example : my $contents_array = bz_slurp_to_array('/tmp/file.txt.bz2');
Exceptions : If the file did not exist or was not readable
Status : Stable
zip_slurp_to_array
Arg [1] : string $file
Arg [2] : boolean $chomp
Arg [3] : HashRef arguments to pass into IO compression layers
Description : Sends the contents of the given zipped file into an ArrayRef
Returntype : ArrayRef
Example : my $contents_array = zip_slurp_to_array('/tmp/file.txt.zip');
Exceptions : If the file did not exist or was not readable
Status : Stable
fh_to_array
Arg [1] : Glob/IO::Handle $fh
Arg [2] : boolean $chomp
Description : Sends the contents of the given filehandle into an ArrayRef.
Will perform chomp on each line if specified. If you require
any more advanced line based processing then see
L<process_to_array>.
Returntype : ArrayRef
Example : my $contents_array = fh_to_array($fh);
Exceptions : None
Status : Stable
process_to_array
Arg [1] : Glob/IO::Handle $fh
Arg [2] : CodeRef $callback
Description : Sends the contents of the given file handle into an ArrayRef
via the processing callback. Assumes line based input.
Returntype : ArrayRef
Example : my $array = process_to_array($fh, sub { return "INPUT: $_[0]"; });
Exceptions : If the fh did not exist or if a callback was not given.
Status : Stable
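The callback-per-line pattern can be sketched as below. This assumes the callback is handed each line as its first argument, as the iterate_lines example in the SYNOPSIS suggests; sketch_process_to_array is a hypothetical name and the argument checks are simplified stand-ins for whatever validation the module performs.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::EnsEMBL::Utils::IO):
# collect the callback's return value for every line read from $fh.
sub sketch_process_to_array {
  my ($fh, $callback) = @_;
  die 'No file handle given' unless defined $fh;
  die 'Callback must be a CodeRef' unless ref($callback) eq 'CODE';
  my @out;
  while (my $line = <$fh>) {
    # Assumption: the line is passed as the first argument
    push @out, $callback->($line);
  }
  return \@out;
}
```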
iterate_lines
Arg [1] : Glob/IO::Handle $fh
Arg [2] : CodeRef $callback
Description : Iterates through each line from the given file handle and
hands them to the callback one by one
Returntype : None
Example : iterate_lines($fh, sub { print "INPUT: $_[0]"; });
Exceptions : If the fh did not exist or if a callback was not given.
Status : Stable
iterate_file
Arg [1] : string $file
Arg [2] : CodeRef the callback which is used to iterate the lines in
the file
Description : Iterates through each line from the given file and
hands them to the callback one by one
Returntype : None
Example : iterate_file('/my/file', sub { print "INPUT: $_[0]"; });
Exceptions : If the file did not exist or if a callback was not given.
Status : Stable
work_with_file
Arg [1] : string $file
Arg [2] : string; $mode
Supports all modes specified by the C<open()> function as well as those
supported by IO::File
Arg [3] : CodeRef the callback which is given the open file handle as
its only argument
Description : Opens the file in the given mode, checks that the file handle
is open, hands it to the callback and closes the file handle
once the callback has returned.
Returntype : None
Example : work_with_file('/tmp/out.txt', 'w', sub {
my ($fh) = @_;
print $fh 'hello';
return;
});
Exceptions : If we could not work with the file due to permissions
Status : Stable
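The open, callback, close pattern described above might look roughly like this in core Perl. The helper name sketch_work_with_file is hypothetical, and the small table translating 'r', 'w' and 'a' into open() modes is an assumption made for the sketch; the real subroutine may resolve modes differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::EnsEMBL::Utils::IO):
# open a file, pass the handle to a callback, then close it.
sub sketch_work_with_file {
  my ($file, $mode, $callback) = @_;
  die 'Callback must be a CodeRef' unless ref($callback) eq 'CODE';
  # Assumption: letter modes are translated; anything else is passed to open() as-is
  my %open_mode = (r => '<', w => '>', a => '>>');
  my $perl_mode = $open_mode{$mode} || $mode;
  open(my $fh, $perl_mode, $file)
    or die "Cannot open '$file' in mode '$mode': $!";
  $callback->($fh);
  close($fh) or die "Cannot close '$file': $!";
  return;
}
```

Keeping the open and close in one place means the caller's callback only deals with reading or writing, which is the design the Description points at.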
gz_work_with_file
Arg [1] : string $file
Arg [2] : string; $mode
Supports modes like C<r>, C<w>, C<< > >> and C<< < >>
Arg [3] : CodeRef the callback which is given the open file handle as
its only argument
Arg [4] : HashRef used to pass options into the IO
compression/uncompression modules
Description : Opens the gzipped file for reading or writing, checks that the
file handle is open, hands it to the callback and closes the
file handle once the callback has returned.
Returntype : None
Example : gz_work_with_file('/tmp/out.txt.gz', 'w', sub {
my ($fh) = @_;
print $fh 'hello';
return;
});
Exceptions : If we could not work with the file due to permissions
Status : Stable
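A gzip-aware version of the same pattern can be sketched with the core IO::Compress::Gzip and IO::Uncompress::Gunzip classes. This is only an approximation: sketch_gz_work_with_file is a hypothetical name, only the read and write modes are handled, and passing the HashRef as constructor options is an assumption.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw($GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical helper (not part of Bio::EnsEMBL::Utils::IO):
# open a gzip stream for reading or writing and hand it to a callback.
sub sketch_gz_work_with_file {
  my ($file, $mode, $callback, $args) = @_;
  die 'Callback must be a CodeRef' unless ref($callback) eq 'CODE';
  my %opts = %{ $args || {} };  # assumption: options go to the constructor
  my $fh;
  if ($mode eq 'r' || $mode eq '<') {
    $fh = IO::Uncompress::Gunzip->new($file, %opts)
      or die "Cannot read '$file': $GunzipError";
  }
  elsif ($mode eq 'w' || $mode eq '>') {
    $fh = IO::Compress::Gzip->new($file, %opts)
      or die "Cannot write '$file': $GzipError";
  }
  else {
    die "Unsupported mode '$mode' in this sketch";
  }
  $callback->($fh);             # both classes work with print and <$fh>
  $fh->close() or die "Cannot close '$file'";
  return;
}
```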
bz_work_with_file
Arg [1] : string $file
Arg [2] : string; $mode
Supports modes like C<r>, C<w>, C<< > >> and C<< < >>
Arg [3] : CodeRef the callback which is given the open file handle as
its only argument
Arg [4] : HashRef used to pass options into the IO
compression/uncompression modules
Description : Opens the bzipped file for reading or writing, checks that the
file handle is open, hands it to the callback and closes the
file handle once the callback has returned.
Returntype : None
Example : bz_work_with_file('/tmp/out.txt.bz2', 'w', sub {
my ($fh) = @_;
print $fh 'hello';
return;
});
Exceptions : If we could not work with the file due to permissions
Status : Stable
zip_work_with_file
Arg [1] : string $file
Arg [2] : string; $mode
Supports modes like C<r>, C<w>, C<< > >> and C<< < >>
Arg [3] : CodeRef the callback which is given the open file handle as
its only argument
Arg [4] : HashRef used to pass options into the IO
compression/uncompression modules
Description : Opens the zipped file for reading or writing, checks that the
file handle is open, hands it to the callback and closes the
file handle once the callback has returned.
Returntype : None
Example : zip_work_with_file('/tmp/out.txt.zip', 'w', sub {
my ($fh) = @_;
print $fh 'hello';
return;
});
Exceptions : If we could not work with the file due to permissions
Status : Stable
filter_dir
Arg [1] : String; directory
Arg [2] : CodeRef; the callback which is given a file in the
directory as its only argument
Description : Returns the lexicographically sorted contents of a directory.
The callback specifies the criteria an entry in the directory
must satisfy in order to appear in the result.
Returntype : Arrayref; list of the filtered files/directories
Example : filter_dir('/tmp', sub {
my $file = shift;
# select perl scripts in the directory
return $file if $file =~ /\.pl$/;
});
Exceptions : If the directory cannot be opened or its handle
cannot be closed
Status : Stable
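The filtering and sorting described for filter_dir can be approximated with opendir, grep and sort. In this sketch the name sketch_filter_dir is hypothetical, and treating any true return value from the callback as "keep this entry" is an assumption based on the example above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::EnsEMBL::Utils::IO):
# list directory entries accepted by the callback, sorted lexicographically.
sub sketch_filter_dir {
  my ($dir, $filter) = @_;
  die 'Filter must be a CodeRef' unless ref($filter) eq 'CODE';
  opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!";
  # Each entry name (not the full path) is passed to the callback
  my @kept = grep { $filter->($_) } readdir($dh);
  closedir($dh) or die "Cannot close directory '$dir': $!";
  my @sorted = sort @kept;      # default sort is lexicographic
  return \@sorted;
}
```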
move_data
Arg [1] : FileHandle $src_fh
Arg [2] : FileHandle $trg_fh
Arg [3] : int $buffer. Defaults to 8KB
Description : Moves data from the given source filehandle to the target one
using an 8KB buffer or a user-specified buffer size
Returntype : None
Example : move_data($src_fh, $trg_fh, 16*1024); # copy in 16KB chunks
Exceptions : If inputs were not as expected
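A buffered copy between two filehandles can be sketched with read and print, as below. The name sketch_move_data is hypothetical, and the loop is an illustration of the chunked-copy idea rather than the module's own code; only the 8KB default is taken from the documentation above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::EnsEMBL::Utils::IO):
# copy everything from $src_fh to $trg_fh in fixed-size chunks.
sub sketch_move_data {
  my ($src_fh, $trg_fh, $buffer_size) = @_;
  $buffer_size ||= 8 * 1024;    # 8KB default, as documented above
  my $buffer;
  while (1) {
    my $read = read($src_fh, $buffer, $buffer_size);
    die "Read failed: $!" unless defined $read;
    last if $read == 0;         # zero bytes read means end of input
    print {$trg_fh} $buffer or die "Write failed: $!";
  }
  return;
}
```

A larger buffer reduces the number of read calls at the cost of memory, which is presumably why the size is left configurable.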