NAME
File::AtomicWrite - writes files atomically via rename()
SYNOPSIS
use File::AtomicWrite ();
# oneshot: requires filename and all the input data
# (as a filehandle or scalar ref)
File::AtomicWrite->write_file(
{ file => 'data.dat',
input => $filehandle,
}
);
# how paranoid are you?
File::AtomicWrite->write_file(
{ file => '/etc/passwd',
input => \$scalarref,
CHECKSUM => 1,
min_size => 100,
}
);
# instance interface: use to stream data or to have
# custom signal handlers
use Digest::SHA1;
my $aw = File::AtomicWrite->new(
{ file => 'name',
min_size => 1,
...
}
);
my $digest = Digest::SHA1->new;
my $tmp_fh = $aw->fh;
my $tmp_file = $aw->filename;
print $tmp_fh ...;
$digest->add(...);
$aw->checksum( $digest->hexdigest )->commit;
DESCRIPTION
This module offers atomic file writes via a temporary file created in the same directory (and therefore probably the same partition) as the specified file. After data has been written to the temporary file, the rename system call is used to replace the target file. The module optionally supports various sanity checks (min_size, CHECKSUM) that help ensure the data is written without errors.
Should anything go awry, the module will die or croak. All calls should be wrapped in eval blocks or, better yet, Try::Tiny.
eval { File::AtomicWrite->write_file(...) };
if ($@) { die "uh oh: $@" }
The module attempts to flush and sync the temporary filehandle prior to the rename call. This may cause portability problems. If so, please let the author know. Also notify the author if false positives from the close call are observed.
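The sequence described above (temporary file in the target's directory, then flush, sync, close, and rename) can be sketched with core modules only. This is a minimal illustration of the general technique, not the module's actual implementation; the helper name atomic_write_sketch, the template string, and the file names are assumptions made up for the example.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempdir);
use IO::Handle ();

# Hypothetical helper, not part of File::AtomicWrite.
sub atomic_write_sketch {
    my ( $target, $data ) = @_;

    # Create the temporary file next to the target so that rename()
    # does not have to cross a filesystem boundary.
    my $tmp = File::Temp->new(
        TEMPLATE => '.tmp.XXXXXXXXXX',
        DIR      => dirname($target),
        UNLINK   => 0,
    );
    my $tmp_name = $tmp->filename;

    my $ok = eval {
        print {$tmp} $data or die "write failed: $!\n";
        $tmp->flush or die "flush failed: $!\n";
        $tmp->sync  or die "sync failed: $!\n";
        close $tmp  or die "close failed: $!\n";
        rename $tmp_name, $target or die "rename failed: $!\n";
        1;
    };
    if ( !$ok ) {
        my $err = $@;
        unlink $tmp_name;    # best-effort cleanup of the temporary file
        die $err;
    }
    return 1;
}

my $dir    = tempdir( CLEANUP => 1 );
my $target = "$dir/data.dat";
atomic_write_sketch( $target, "first\n" );
atomic_write_sketch( $target, "second\n" );    # replaces the first version
```

A reader of the target file sees either the old contents or the new contents, never a partially written file, provided the rename call is atomic on the filesystem in question.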
CLASS METHODS
- write_file options hash reference
-
Requires a hash reference that must contain both the input and file options. Performs the various required steps in a single method call. Only if all checks pass will the input data be moved to the target file via rename. If not, the module will throw an error and attempt to clean up any temporary files created.
See "OPTIONS" for additional settings that can be passed to write_file.
write_file installs local signal handlers for INT, TERM, and __DIE__ to try to clean up any active temporary files if the process is killed or dies. If these are a problem, use the OO interface instead and set up signal handlers as necessary.
- safe_level safe_level value
-
Method to customize the File::Temp module safe_level value. Consult the File::Temp documentation for more information on this option.
Can also be set via the safe_level option.
- set_template File::Temp template
-
Method to customize the default File::Temp template used when creating temporary files. NOTE: if customized, the template must end with a sufficient number of X characters, as otherwise File::Temp will throw an error:
template => "mytmp.X", # Wrong
template => "mytmp.XXXXXXXXXX", # better
Can also be set via the template option.
- new options hash reference
-
Takes most of the same options as write_file and returns an object. The notable exception is input, on the presumption that the temporary file or filehandle will be used by other code to write the file. Sanity checks are deferred until the commit method is called. A call to the checksum method with a suitable argument is required for the checksum verification to pass.
If a rollback is required, undef the File::AtomicWrite object; the object destructor should then unlink the temporary file. However, should the process receive a TERM, INT, or other signal that causes the script to exit, the temporary file will not be cleaned up. If this is undesirable, a signal handler must be installed:
my $aw = File::AtomicWrite->new({file => 'somefile'});
for my $sig_name (qw/INT TERM/) {
$SIG{$sig_name} = sub { exit }
}
...
Consult perlipc(1) for more information on signal handling, and the eg/cleanup-test program under this module distribution. A __DIE__ signal handler may also be necessary; consult the die perlfunc documentation for details.
Instances must not be reused; create a new instance instead of calling new again on an existing instance. Reuse may cause undefined behavior or other unexpected problems.
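As a rough sketch of the rollback behavior described for new, the following assumes File::AtomicWrite is installed and uses only the methods documented here (new, fh, filename). The directory and file names are hypothetical, and the claim that the temporary file disappears rests on the documented destructor behavior rather than on anything checked here.

```perl
use strict;
use warnings;
use File::AtomicWrite ();
use File::Temp qw(tempdir);

my $dir    = tempdir( CLEANUP => 1 );
my $target = "$dir/somefile";    # hypothetical target path

my $aw = File::AtomicWrite->new( { file => $target } );

# Turn INT and TERM into a normal exit so that destructors can run.
for my $sig_name (qw/INT TERM/) {
    $SIG{$sig_name} = sub { exit };
}

my $tmp_file = $aw->filename;
print { $aw->fh } "partial data\n" or die "write failed: $!";

# Suppose something went wrong at this point: roll back instead of
# calling commit. Per the documentation, the destructor should unlink
# the temporary file; the target is never created.
undef $aw;

my $target_created = -e $target ? 1 : 0;
print "target created: $target_created\n";
```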
INSTANCE METHODS
- fh
-
Returns the temporary filehandle.
- filename
-
Returns the file name of the temporary file.
- checksum SHA1 hexdigest
-
Takes a single argument that must contain the Digest::SHA1 hexdigest of the data written to the temporary file. Enables the CHECKSUM option.
- commit
-
Call this method once finished with the temporary file. A number of sanity checks (if enabled via the appropriate "OPTIONS") will be performed. If these pass, the temporary file will be renamed to the real filename.
No subsequent use of the instance should be made after calling this method as this would lead to undefined behavior.
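The instance methods above might be combined as follows to stream data in chunks while computing the checksum alongside. This is a sketch that assumes File::AtomicWrite and Digest::SHA1 are installed; the target path and the chunk data are invented for illustration.

```perl
use strict;
use warnings;
use Digest::SHA1 ();
use File::AtomicWrite ();
use File::Temp qw(tempdir);

my $dir    = tempdir( CLEANUP => 1 );
my $target = "$dir/report.txt";    # hypothetical target path

my $aw     = File::AtomicWrite->new( { file => $target, min_size => 1 } );
my $tmp_fh = $aw->fh;
my $digest = Digest::SHA1->new;

# Write each chunk to the temporary filehandle and feed the same
# bytes to the digest.
for my $chunk ( "alpha\n", "beta\n", "gamma\n" ) {
    print {$tmp_fh} $chunk or die "write failed: $!";
    $digest->add($chunk);
}

# Supply the checksum, then run the sanity checks and rename.
eval { $aw->checksum( $digest->hexdigest )->commit; 1 }
    or die "commit failed: $@";
```

The instance is not used again after commit, in line with the warning above.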
OPTIONS
The write_file and new methods accept a number of options, supplied via a hash reference. Mandatory options:
- file => filename
-
A filename in the current working directory, or a path to the file that will (eventually) be created. By default, the temporary file will be written into the parent directory of the file path. This default can be changed by using the tmpdir option.
If the MKPATH option is true, the module will attempt to create any missing directories. If the MKPATH option is false or not set, the module will throw an error should any parent directories of the file not exist.
- input => scalar ref or filehandle
-
Mandatory for the write_file method, illegal for the new method. Scalar reference, or otherwise some filehandle reference that can be looped over via readline. Supplies the data to be written to file.
Optional options:
- backup => suffix
-
Make a backup with this (non-empty) suffix. The backup is always created, even if there was no change. If a previous backup existed, it is deleted first. Usual throwing of error.
- BINMODE => true or false
-
If true, binmode is set on the temporary filehandle prior to writing the input data to it. Default is not to set binmode.
- binmode_layer => LAYER
-
Supply a LAYER argument to binmode. Enables BINMODE.
# just binmode (binary data)
...->write_file({ ..., BINMODE => 1 });
# custom binmode layer
...->write_file({ ..., binmode_layer => ':utf8' });
- checksum => sha1 hexdigest
-
If this option exists, and CHECKSUM is true, the module will not create a Digest::SHA1 hexdigest of the data being written out to disk, but instead will rely on the value passed by the caller.
Only for the write_file interface. When using the instance interface, call the checksum method instead to supply a hexdigest checksum of the data written; see the "SYNOPSIS" for an example of this.
- CHECKSUM => true or false
-
If true, Digest::SHA1 will be used to checksum the data read back from the disk against the checksum derived from the data written out to the temporary file.
Use the checksum option (or checksum method) to supply a Digest::SHA1 hexdigest checksum. This will spare the module the task of computing the checksum on the data being written.
Only for the write_file interface.
- min_size => size
-
Specify a minimum size (in bytes) that the data written must exceed. If not, the module throws an error. (It was a process that wrote out a zero-sized /etc/passwd file that prompted the creation of this module.)
- MKPATH => true or false
-
If true, attempt to create the parent directories of file should that directory not exist. If false (or unset), and the parent directory does not exist, the module throws an error. If the directory cannot be created, the module throws an error.
If true, this option will also attempt to create the tmpdir directory, if that option is set.
- mode => unix mode
-
Accepts a Unix mode for chmod to be applied to the file. Usual throwing of error. If the mode is a string starting with 0, oct is used to convert it:
my $orig_mode = (stat $source_file)[2] & 07777;
...->write_file({ ..., mode => $orig_mode });
my $mode = '0644';
...->write_file({ ..., mode => $mode });
The module does not change umask, nor is there a means to specify the permissions on directories created if MKPATH is set.
- mtime => mtime
-
Accepts an mtime timestamp for utime to be applied to the file. Usual throwing of error.
- owner => unix ownership string
-
Accepts similar arguments to chown(1) to be applied via chown to the file. Usual throwing of error.
...->write_file({ ..., owner => '0' });
...->write_file({ ..., owner => '0:0' });
...->write_file({ ..., owner => 'user:somegroup' });
- safe_level => safe_level value
-
Optional means to set the File::Temp module safe_level value. Consult the File::Temp documentation for more information on this option.
This value can also be set via the safe_level class method.
- template => File::Temp template
-
Template to supply to File::Temp. Defaults to a reasonable value if unset. NOTE: if customized, the template must end with a sufficient number of X characters, as otherwise File::Temp will throw an error.
Can also be set via the set_template class method.
- tmpdir => directory
-
If set to a directory, the temporary file will be written to this directory instead of by default to the parent directory of the target file. If the tmpdir is on a different partition than the parent directory for file, or if anything else goes awry, the module will throw an error: rename(2) does not operate across partition boundaries.
This option is advisable when writing files to include directories such as /etc/logrotate.d, as the programs that read include files from these directories may read even a temporary dot file while it is being written. To avoid this (slight but non-zero) risk, use the tmpdir option to write the configuration out in full under a different directory on the same partition.
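Several of the options above can be combined in one write_file call. The sketch below assumes File::AtomicWrite is installed; the path, the configuration text, and the chosen option values are illustrative only.

```perl
use strict;
use warnings;
use File::AtomicWrite ();
use File::Temp qw(tempdir);

my $dir    = tempdir( CLEANUP => 1 );
my $target = "$dir/conf/app.conf";    # hypothetical; conf/ does not exist yet
my $config = "loglevel = info\n";     # hypothetical file contents

my $ok = eval {
    File::AtomicWrite->write_file(
        {   file     => $target,
            input    => \$config,     # scalar reference to the data
            MKPATH   => 1,            # create the missing conf/ directory
            CHECKSUM => 1,            # verify the data read back from disk
            min_size => 1,            # refuse to write an empty file
            mode     => '0644',       # leading 0, so converted via oct
        }
    );
    1;
};
die "atomic write failed: $@" unless $ok;

my $mode = ( stat $target )[2] & 07777;
printf "%s written with mode %04o\n", $target, $mode;
```

Wrapping the call in eval keeps a failed check (for example min_size) from terminating the program before the caller can react.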
BUGS
No known bugs (lots of potential issues, though, see below).
Reporting Bugs
http://github.com/thrig/File-AtomicWrite
Known Issues
See perlport for various portability problems possible with the rename call. Consult rename(2) or equivalent for caveats. Note however that rename(2) is used heavily by common programs such as mv(1) and rsync.
File hard links created by ln(1) will be broken by this module, as this module has no way of knowing whether any other files link to the inode of the file being operated on:
% touch afile
% ln afile afilehardlink
% ls -i afile*
3725607 afile 3725607 afilehardlink
% perl -MFile::AtomicWrite -e \
'File::AtomicWrite->write_file({file =>"afile",input=>\"foo"})'
% ls -i afile*
3725622 afile 3725607 afilehardlink
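A caller that cares about hard links could check the link count before writing. The following is a sketch using only core Perl; the helper name link_count is hypothetical and is not something the module provides.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: number of hard links to a file, taken from
# field 3 (nlink) of the list returned by stat.
sub link_count {
    my ($path) = @_;
    my @st = stat $path or die "stat $path: $!";
    return $st[3];
}

my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/afile";

open my $fh, '>', $file or die "open $file: $!";
close $fh or die "close $file: $!";
link $file, "$dir/afilehardlink" or die "link: $!";

my $links = link_count($file);
warn "$file has $links links; a rename-based write would detach it from the others\n"
    if $links > 1;
```

The link count may not be meaningful on every filesystem, so treat the result as a hint rather than a guarantee.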
Union or bind mounts might also be a problem, if what is actually some other filesystem is present between the temporary and final file locations.
Some filesystems may also require a fsync call on a filehandle of the directory containing the file (see fsync(2) on RHEL, for example), to ensure that the directory data also reaches disk, in addition to the contents of the file. Certain filesystem options may also need to be set, such as data=journal or data=ordered on ext3, so that any crashes or unexpected glitches have less chance of unanticipated problems (such as the file write being ordered after the rename).
Renames may strip fancy ACLs or SELinux contexts.
SEE ALSO
Supporting modules:
Digest::SHA1, File::Basename, File::Path, File::Temp
This isn't easy:
http://danluu.com/file-consistency/
https://homes.cs.washington.edu/~lijl/papers/ferrite-asplos16.pdf
https://unix.stackexchange.com/questions/464382
AUTHOR
thrig - Jeremy Mates (cpan:JMATES) <jmates at cpan.org>
mtime and other features contributed by Stijn De Weirdt.
COPYRIGHT
Copyright (C) 2009-2016,2018 Jeremy Mates
This program is distributed under the (Revised) BSD License: http://www.opensource.org/licenses/BSD-3-Clause