Name
File::Replace - Perl extension for replacing files by renaming a temp file over the original
Synopsis
use File::Replace 'replace3';
my ($infh,$outfh,$repl) = replace3($filename);
while (<$infh>) {
    # write whatever you like to $outfh here
    print $outfh "X: $_";
}
$repl->finish; # closes the handles
Description
This module implements and hides the following pattern for you:
1. Open a temporary file for output
2. While reading from the original file, write output to the temporary file
3. rename the temporary file over the original file
In many cases, in particular on many UNIX filesystems, the rename operation is atomic*. This means that in such cases, the original filename will always exist, and will always point to either the new or the old version of the file, so a user attempting to open and read the file will always be able to do so, and never see an unfinished version of the file while it is being written.
* Warning: Unfortunately, whether or not a rename will actually be atomic in your specific circumstances is not always an easy question to answer, as it depends on exact details of the operating system and file system. Consult your system's documentation and search the Internet for "atomic rename" for more details. This module's job is to perform the rename, and it can make no guarantees as to whether it will be atomic or not.
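For orientation, the three steps above can be sketched by hand in core Perl. This is only a minimal illustration of the pattern, not this module's implementation: the helper name prefix_lines, the temp file template, and the demo file are all assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);
use File::Basename qw(dirname);

# Hypothetical helper for illustration only (not part of File::Replace).
sub prefix_lines {
    my ($filename, $prefix) = @_;
    open my $in, '<', $filename or die "open $filename: $!";
    # Step 1: create the temporary file in the same directory, so that
    # rename() does not have to cross a filesystem boundary.
    my ($out, $tmpname) = tempfile('.replace_XXXXXX',
        DIR => dirname($filename), SUFFIX => '.tmp');
    # Step 2: read from the original, write to the temporary file.
    while (my $line = <$in>) {
        print {$out} $prefix, $line or die "write $tmpname: $!";
    }
    close $in;
    close $out or die "close $tmpname: $!";
    # Step 3: rename the temporary file over the original.
    rename $tmpname, $filename or die "rename $tmpname: $!";
    return;
}

# Demo on a scratch file in a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.txt";
open my $fh, '>', $file or die "open $file: $!";
print $fh "one\ntwo\n";
close $fh;
prefix_lines($file, 'X: ');
```

The sketch leaves out what the module handles for you, such as cleaning up the temporary file on errors and restoring the original file's permissions.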
Important Notice
As of version 0.16, the distribution of this module has been split into several distributions: See the documentation of "replace", "replace2", and "inplace" for details.
Version
This documentation describes version 0.18 of this module.
Constructors and Overview
The constructors File::Replace->new(), replace3(), replace2(), and replace() take exactly the same arguments and differ only in their return values. replace2 and replace require you to install File::Replace::Fancy, and they wrap the functionality of File::Replace inside tied filehandles.
Note that replace3(), replace2(), and replace() are normal functions and not methods; don't attempt to call them as such. If you don't want to import them, you can always call them as, for example, File::Replace::replace().
File::Replace->new( $filename );
File::Replace->new( $filename, $layers );
File::Replace->new( $filename, option => 'value', ... );
File::Replace->new( $filename, $layers, option => 'value', ... );
# replace3(...), replace2(...), and replace(...) take the same arguments
The constructors will open the input file and the temporary output file (the latter via File::Temp), and will die in case of errors. The options are described in "Constructor Options". It is strongly recommended that you use warnings, as this module will then issue warnings which may be of interest to you.
new
use File::Replace;
my $replace_object = File::Replace->new($filename, ...);
Returns a new File::Replace object. The central methods provided are ->in_fh and ->out_fh, which return the input and output filehandles for reading and writing, respectively, and ->finish, which causes the files to be closed and the replace operation to be performed. There is also ->cancel, which just discards the temporary output file without touching the input file. Additional helper methods are mentioned below.
finish will die on errors, while cancel will only return a false value on errors. This module will try to clean up after itself (remove temporary files) as best it can, even when things go wrong.
Please don't re-open the in_fh and out_fh handles, as this may lead to confusion.
The method ->is_open will return a false value if the replace operation has been finished or canceled, or a true value if it is still active (note that this method does not check the state of the underlying filehandles). The method ->filename returns the filename passed to the constructor. The method ->options in list context returns the options this object has set (including defaults) as a list of key/value pairs; in scalar context it returns a hashref of these options.
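The following is a small sketch of how the methods described above fit together, assuming File::Replace is installed. The scratch file and its contents are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Replace;

# Scratch file for the demo (hypothetical name and contents).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/example.txt";
open my $fh, '>', $file or die "open $file: $!";
print $fh "alpha\nbeta\n";
close $fh;

my $repl     = File::Replace->new($file);
my $was_open = $repl->is_open;    # true while the operation is active
my $name     = $repl->filename;   # the name passed to the constructor
my $in       = $repl->in_fh;
my $out      = $repl->out_fh;
while (my $line = <$in>) {
    print {$out} uc $line;
}
$repl->finish;                    # closes the handles and performs the rename
my $still_open = $repl->is_open;  # false after finish

# A second operation that is abandoned: cancel discards the temporary
# output file and leaves the original untouched.
my $repl2 = File::Replace->new($file);
print { $repl2->out_fh } "discarded\n";
$repl2->cancel;
```

Since finish dies on errors, wrap the call in eval (or a try/catch module) if you need to handle a failure yourself.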
replace3
This is a convenience function for shorter code:
use File::Replace 'replace3';
my ($in_fh,$out_fh,$repl_obj) = replace3($filename, ...);
is the same as
use File::Replace;
my $repl_obj = File::Replace->new($filename, ...);
my $in_fh = $repl_obj->in_fh;
my $out_fh = $repl_obj->out_fh;
replace2
Note: As of version 0.16, if you want to use this function, you must install File::Replace::Fancy!
use File::Replace 'replace2';
my ($infh,$outfh) = replace2($filename);
while (<$infh>) {
    print $outfh "Y: $_";
}
close $infh; # closing both handles will
close $outfh; # trigger the replace
In list context, returns a two-element list of two tied filehandles, the first being the input filehandle, and the second the output filehandle, and the replace operation (finish) is performed when both handles are closed. In scalar context, it returns only the output filehandle, and the replace operation is performed when this handle is closed. This means that close may die instead of just returning a false value.
You cannot re-open these tied filehandles.
You can access the underlying File::Replace object via tied(*$handle)->replace on both the input and output handle. You can also access the original, untied filehandles via tied(*$handle)->in_fh and tied(*$handle)->out_fh, but please don't close or re-open these handles as this may lead to confusion.
replace
Note: As of version 0.16, if you want to use this function, you must install File::Replace::Fancy!
use File::Replace 'replace';
my $fh = replace($filename);
while (<$fh>) {
    # can read _and_ write from/to $fh
    print $fh "Z: $_";
}
close $fh;
Returns a single, "magical" tied filehandle. The operations print, printf, and syswrite are passed through to the output filehandle, binmode operates on both the input and output handle, and fileno only reports -1 if the File::Replace object is still active or undef if the replace operation has finished or been canceled. All other I/O functions, such as <$handle>, readline, sysread, seek, tell, eof, etc. are passed through to the input handle. You can still access these operations on the output handle via e.g. eof( tied(*$handle)->out_fh ) or tied(*$handle)->out_fh->tell(). The replace operation (finish) is performed when you close the handle, which means that close may die instead of just returning a false value.
Re-opening the handle causes a new underlying File::Replace object to be created. You should explicitly close the filehandle first so that the previous replace operation is performed (or cancel that operation). The "mode" argument (or filename in the case of a two-argument open) may not contain a read/write indicator (<, >, etc.), only PerlIO layers.
You can access the underlying File::Replace object via tied(*$handle)->replace. You can also access the original, untied filehandles via tied(*$handle)->in_fh and tied(*$handle)->out_fh, but please don't close or re-open these handles as this may lead to confusion.
inplace
Warning: As of version 0.16, if you want to use this function, you must install the experimental module File::Replace::Inplace.
This is a shorthand for the constructor of File::Replace::Inplace. That is:
use File::Replace qw/inplace/;
my $inplace = inplace(...);
is the same as
use File::Replace::Inplace;
my $inplace = File::Replace::Inplace->new(...);
As a special feature, if the import list contains a string beginning with -i, then a global File::Replace::Inplace object will be set up, so ARGV will be tied from the beginning of the script. Anything following the -i will be used for the "backup" option. The purpose of this feature is to provide a replacement for Perl's -i command-line switch in oneliners. For example, you can say:
perl -MFile::Replace=-i.bak -pe 's/foo/bar/g' file1.txt file2.txt
and those files will be edited in-place using this module. In addition, you may specify a -D "switch" in the import list to enable debugging output, as in:
perl -MFile::Replace=-i,-D -pe 's/x/y/g' foo.txt bar.txt
The -D switch currently only affects the "inplace" operations described here, but this may be expanded upon in the future to enable debugging everywhere.
Constructor Options
Filename
A filename. The temporary output file will be created in the same directory as this file. Its name will be based on the original filename, but prefixed with a dot (.) and suffixed with a random string and an extension of .tmp. If the input file does not exist (ENOENT), then the behavior will depend on the "create" option.
layers
This option can either be specified as the second argument to the constructors, or as the layers => '...' option in the options hash, but not both. It is a list of PerlIO layers such as ":utf8", ":raw:crlf", or ":encoding(UTF-16)". Note that the default layers differ based on operating system; see "open" in perlfunc.
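As a sketch of the positional form, the example below passes an encoding layer as the second argument, assuming File::Replace is installed and that the layers are applied to both the input and the output handle. The scratch file and its contents are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Replace;

# Scratch file containing a non-ASCII character, written as UTF-8.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/utf8.txt";
open my $fh, '>:encoding(UTF-8)', $file or die "open $file: $!";
print $fh "gr\x{FC}n\n";
close $fh;

# Layers given as the second positional argument. The equivalent
# options form would be: File::Replace->new($file, layers => ':encoding(UTF-8)')
my $repl = File::Replace->new($file, ':encoding(UTF-8)');
my ($in, $out) = ($repl->in_fh, $repl->out_fh);
while (my $line = <$in>) {
    # The input is decoded, so uc operates on characters, not bytes.
    print {$out} uc $line;
}
$repl->finish;
```

Pass the layers in only one of the two places; per the description above, specifying both is not allowed.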
create
This option configures the behavior of the module when the input file does not exist (ENOENT). There are three modes, which you specify as one of the following strings. If you need more precise control of the input file, see the "in_fh" option - note that create is ignored when you use that option.
"later" (default when create is omitted or undef)
Instead of the input file, /dev/null or its equivalent is opened. This means that while the output file is being written, the input file name will not exist, and only come into existence when the rename operation is performed.
"now"
If the input file does not exist, it is immediately created and opened. There is currently a potential race condition: if the file is created by another process before this module can create it, then the behavior is undefined - the file may be emptied of its contents, or you may be able to read its contents. This behavior may be fixed and specified in a future version. The race condition is discussed some more in "Concurrency and File Locking".
Currently, this option is implemented by opening the file with a mode of +>, meaning that it is created (clobbered) and opened in read-write mode. However, that should be considered an implementation detail that is subject to change. Do not attempt to take advantage of the read-write mode by writing to the input file - that contradicts the purpose of this module anyway. Instead, the input file will exist and remain empty until the replace operation.
"off" (or "no")
Attempting to open a nonexistent input file will cause the constructor to die.
Previous versions of this module included support for other values of the create option, as well as the devnull option. These were replaced by the above create options and deprecated in 0.06, and removed as of 0.08. Using unrecognized options will result in a fatal error. Note that in 0.06, specifying undef for the create option resulted in a deprecation warning; that behavior has now been changed so that undef is equivalent to the create option not being set.
backup
If you set this option to a non-empty string, then immediately after successfully opening the input file, it is copied to a file with the same name and the extension specified by this option (unless you use * characters in the string, see below). For example, File::Replace->new("test.txt", backup=>".bak") results in a copy of test.txt being made to test.txt.bak. If that file already exists or something goes wrong with the copy operation, then the constructor will die.
As with Perl's -i option, if the string contains * characters, then instead of the string being appended to the filename, each * character is replaced with the original filename. So for example, if you specify backup=>'orig_*', then the backup of test.txt will be orig_test.txt in the same path - unlike Perl's -i option, this feature cannot be used to move files into a different directory.
Warning: If there is another process writing to the input file or creating files in the same directory as the input file, there is a potential for race conditions when using this option!
This option was introduced in version 0.10.
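Both forms of the backup option might be used as in the following sketch, assuming File::Replace is installed. The slurp helper and the file contents are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Replace;

# Hypothetical helper for the demo: read a whole file into a string.
sub slurp {
    my ($name) = @_;
    open my $fh, '<', $name or die "open $name: $!";
    local $/;
    return scalar <$fh>;
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/test.txt";
open my $fh, '>', $file or die "open $file: $!";
print $fh "original\n";
close $fh;

# Suffix form: the backup is written to test.txt.bak by the constructor.
my $repl = File::Replace->new($file, backup => '.bak');
print { $repl->out_fh } "replaced\n";
$repl->finish;

# Wildcard form: each * is replaced with the original filename, so the
# backup goes to orig_test.txt in the same directory.
my $repl2 = File::Replace->new($file, backup => 'orig_*');
print { $repl2->out_fh } "third\n";
$repl2->finish;
```

Running the first constructor call a second time would die, because test.txt.bak then already exists.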
in_fh
This option allows you to pass an existing input filehandle to this module, instead of having the constructors open the input file for you. Use this option if you need more precise control over how the input file is opened, e.g. if you want to use sysopen to open it. The handle must be open, which will be checked by calling fileno on the handle. The module makes no attempt to check that the filename you pass to the module matches the filehandle. The module will attempt to stat the handle to get its permissions, except when you have specified the "perms" option or disabled the "chmod" option. The "create" option is ignored when you use this option.
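A minimal sketch of the sysopen case mentioned above, assuming File::Replace is installed. The scratch file is invented for the example, and it is up to the caller to pass a filename that matches the handle.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use File::Temp qw(tempdir);
use File::Replace;

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/input.txt";
open my $fh, '>', $file or die "open $file: $!";
print $fh "a\nb\n";
close $fh;

# Open the input ourselves, then hand the already-open handle over.
sysopen(my $in, $file, O_RDONLY) or die "sysopen $file: $!";
my $repl = File::Replace->new($file, in_fh => $in);

my $src = $repl->in_fh;
my $out = $repl->out_fh;
while (my $line = <$src>) {
    print {$out} "> $line";
}
$repl->finish;
```

Any "create" setting would be ignored here, since the module does not open the input file itself.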
perms
perms => 0640 # ok
perms => oct("640") # ok
perms => "0640" # WRONG!
Normally, just before the rename is performed, File::Replace will chmod the temporary file to those permissions that the original file had when it was opened, or, if the original file did not yet exist, default permissions based on the current umask. Setting this option to an octal value (a number, not a string!) will override those permissions. See also "chmod", which can be used to disable the chmod operation.
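The reason the string form is wrong is plain Perl numeric conversion, which the following core-Perl sketch illustrates without involving this module: a string such as "0640" is converted as a decimal number, not as an octal one.

```perl
use strict;
use warnings;

my $literal = 0640;         # octal literal, decimal 416
my $via_oct = oct("640");   # explicit conversion, also decimal 416
my $string  = "0640";       # a string, NOT an octal number
my $wrong   = $string + 0;  # numifies as decimal 640, i.e. octal 1200

printf "%04o %04o %04o\n", $literal, $via_oct, $wrong;
# prints "0640 0640 1200"
```

So passing "0640" would request mode 01200 rather than the intended rw-r----- permissions.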
chmod
This option is enabled by default, unless you set $File::Replace::DISABLE_CHMOD to a true value. When you disable this option, the chmod operation that is normally performed just before the rename will not be attempted. This is mostly intended for systems where you know the chmod will fail. See also "perms", which allows you to define what permissions will be used.
Note that the temporary files created with File::Temp will have 0600 permissions if left unchanged (except of course on systems that don't support this kind of restrictive permissions).
autocancel
If the File::Replace object is destroyed (e.g. when it goes out of scope), and the replace operation has not been performed yet, normally it will cancel the replace operation and issue a warning. Enabling this option makes that implicit canceling explicit, silencing the warning.
This option cannot be used together with autofinish.
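A short sketch of autocancel, assuming File::Replace is installed: the object is simply allowed to go out of scope without finish, and the original file is left as it was. The warning handler and the scratch file exist only for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Replace;

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/keep.txt";
open my $fh, '>', $file or die "open $file: $!";
print $fh "unchanged\n";
close $fh;

# Collect any warnings so the demo can show that none are issued.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

{
    my $repl = File::Replace->new($file, autocancel => 1);
    print { $repl->out_fh } "never committed\n";
    # No finish here: $repl is destroyed at the end of this block,
    # which cancels the replace operation without a warning.
}
```

Without autocancel => 1, the same block would still leave the file untouched but would emit a warning when the object is destroyed.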
autofinish
When set, causes the finish operation to be attempted when the object is destroyed (e.g. when it goes out of scope).
However, using this option is actually not recommended unless you know what you are doing. This is because the replace operation will also be attempted when your script is dying, in which case the output file may be incomplete, and you may not want the original file to be replaced. A second reason is that the replace operation may be attempted during global destruction, and it is not a good idea to rely on this always going well. In general it is better to finish the replace operation explicitly.
This option cannot be used together with autocancel.
debug
If set to a true value, this option enables some debug output for new, finish, and cancel. You may also set this to a filehandle, and debug output will be sent there.
Additional Methods
copy
This method copies a certain number of "characters" from the input handle to the output handle, that is, the temporary file. Depending on the status of the filehandle, either (8-bit) bytes or characters are read, see "read" in perlfunc. The option bufsize lets you adjust the read buffer size, and the option less=>'ignore' or less=>'ok' suppresses the warning that fewer characters than you requested could be read. The method returns the number of characters copied and dies on errors.
use File::Replace;
my $repl = File::Replace->new($filename, ...);
$repl->copy(8); # copy eight characters
$repl->copy(1024, bufsize=>256); # copy 1024 chars, 256 at a time
$repl->copy(2048, less=>'ok'); # copy 2048, but don't warn if less
$repl->finish;
This method was added in version 0.08.
Notes and Caveats
Concurrency and File Locking
This module is very well suited for situations where a file has one writer and one or more readers.
Among other things, this is reflected in the case of a nonexistent file, where the "create" settings "now" and "later" (the default) are currently implemented as a two-step process, meaning there is the potential of the input file being created in the short period of time between the first and second open attempts, which this module currently will not notice.
Having multiple writers is possible, but care must be taken to ensure proper coordination of the writers!
For example, a simple flock of the input file is not enough: if there are multiple processes, remember that each process will replace the original input file by a new and different file! One possible solution would be a separate lock file that does not change and is only used for flocking. There are other possible methods, but that is currently beyond the scope of this documentation.
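The separate lock file idea could look roughly like the sketch below, assuming File::Replace is installed. The helper locked_update and the ".lock" naming convention are hypothetical, and every writer would have to follow the same convention for the locking to be effective.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);
use File::Replace;

# Hypothetical helper: serialize writers via a lock file that is never
# replaced, then run the usual replace operation while holding the lock.
sub locked_update {
    my ($file, $code) = @_;
    my $lockfile = "$file.lock";   # assumed naming convention
    open my $lock, '>>', $lockfile or die "open $lockfile: $!";
    flock($lock, LOCK_EX) or die "flock $lockfile: $!";
    my $repl = File::Replace->new($file);
    $code->($repl->in_fh, $repl->out_fh);
    $repl->finish;
    close $lock;                   # closing the handle releases the lock
    return;
}

# Demo: increment a counter stored in a scratch file.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/counter.txt";
open my $fh, '>', $file or die "open $file: $!";
print $fh "41\n";
close $fh;

locked_update($file, sub {
    my ($in, $out) = @_;
    my $n = <$in>;
    chomp $n;
    print {$out} $n + 1, "\n";
});
```

The lock is taken before the input file is opened, so each writer reads the version left behind by the previous one.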
(For the sake of completeness, note that you cannot flock the tied handles, only the underlying filehandles.)
Author, Copyright, and License
Copyright (c) 2017-2023 Hauke Daempfling (haukex@zero-g.net) at the Leibniz Institute of Freshwater Ecology and Inland Fisheries (IGB), Berlin, Germany, http://www.igb-berlin.de/
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.