NAME
Sys::Export::Unix - Export subsets of a UNIX system
SYNOPSIS
use Sys::Export::Unix;
my $exporter= Sys::Export::Unix->new(
src => '/', dst => '/initrd',
rewrite_path => {
'sbin' => 'bin',
'usr/bin' => 'bin',
'usr/sbin' => 'bin',
'usr/lib' => 'lib',
},
);
$exporter->add('bin/busybox');
DESCRIPTION
This object contains the logic for exporting unix-style systems.
CONSTRUCTORS
new
Sys::Export::Unix->new(\%attributes); # hashref
Sys::Export::Unix->new(%attributes); # key/value list
Required attributes:
- src
-
The root of the system to export from (often '/', but you must specify this)
- dst
-
The root of the exported system. This directory must exist, and should be empty unless you specify 'on_collision'.
It can also be an object with 'add' and 'finish' methods, which avoids constructing a staging directory at all and doesn't require root permission to operate.
Options:
- rewrite_path
-
Convenience for calling "rewrite_path" using a hashref of { src => dst } pairs.
- rewrite_user
-
Convenience for calling "rewrite_user" using a hashref of { src => dst } pairs.
- rewrite_group
-
Convenience for calling "rewrite_group" using a hashref of { src => dst } pairs.
- src_userdb
-
An instance of Sys::Export::Unix::UserDB, or constructor parameters for one. The default is to read $src/etc/passwd, or fall back to the getpwnam function of the host. See "USER REMAPPING" for more details.
- dst_userdb
-
An instance of Sys::Export::Unix::UserDB, or constructor parameters for one. If defined, this will trigger name-based translations of all UID/GID values written to the destination filesystem. See "USER REMAPPING" for more details.
- tmp
-
A temporary directory where this module can prepare temporary files. If you are using a filesystem destination, it will default to the same device as the staging directory.
When finish is called, this is set to undef so that instances of File::Temp can clean themselves up.
- on_collision
-
Specifies what to do if there is a name collision in the destination. See attribute "on_collision".
- log
-
This can either be a Log::Any instance, or a string specifying a log level such as "debug" or "trace". The default logging is on STDOUT (level 'info') and simply lists the files being copied and whether they were patched.
ATTRIBUTES
src
The root of the source filesystem. It must be the actual root used by the symlinks and library paths inside this filesystem, or things will break.
src_abs
The abs_path of the root of the source filesystem, always ending with '/'.
src_userdb
An instance of Sys::Export::Unix::UserDB. This attribute is undef until it is needed, unless you specified it to the constructor. See "USER REMAPPING" for details.
dst
The root of the destination filesystem, OR a coderef which receives files which are ready to be recorded. This must be the logical root of your destination filesystem, which will be used when symlinks or library paths refer to '/'. If you want to move files into a subdirectory of the logical destination filesystem, see "rewrite_path". If you provide a coderef, the signature is
sub ($exporter, $file_attrs) { ... }
dst_abs
The abs_path of the root of the destination filesystem, always ending with '/'. This is only defined if dst is not a coderef.
dst_userdb
An instance of Sys::Export::Unix::UserDB. This attribute is undef until it is needed, unless you specified it to the constructor. See "USER REMAPPING" for details.
tmp
The abs_path of a directory to use for temporary staging before renaming into "dst". This must be in the same volume as dst so that rename() can be used to move temporary files into their dst location.
src_path_set
A hashref of all source paths which have been processed, and which destination path they were written as. All paths are logically absolute to their respective roots, but without a leading slash.
dst_path_set
A hashref of all destination paths which have been created (as keys). If the value of the key is defined, it is the source path. If not defined, it means the destination was created without reference to a source path.
dst_uid_used
The set of numeric user IDs which have been written to dst.
dst_gid_used
The set of numeric group IDs which have been written to dst.
path_rewrite_regex
A regex that matches the longest prefix of a source path having a rewrite rule.
on_collision
Specifies what to do if there is a name collision in the destination. The default (undef) causes an exception unless the existing file is identical to the one that would be written.
Setting this to 'overwrite' will unconditionally replace files as it runs. Setting it to 'ignore' will silently ignore collisions and leave the existing file in place. Setting it to a coderef will provide you with the path and content that was about to be written to it:
$exporter->on_collision(sub ($dst_path, $fileinfo) {
# dst_path is the relative-to-dst-root path about to be written
# fileinfo is the hash of file attributes passed to ->add
return $action; # 'ignore' or 'overwrite' or 'ignore_if_same'
});
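As a minimal sketch of such a callback, the coderef below applies a purely path-based policy. The policy itself (replace anything under etc/, keep existing files elsewhere) and the variable name $collision_policy are assumptions made for illustration; only the ($dst_path, $fileinfo) arguments and the 'ignore' and 'overwrite' return values come from the description above.

```perl
use strict;
use warnings;

# Hypothetical policy, not part of the module: overwrite files under etc/,
# and leave any other existing file in place.
my $collision_policy = sub {
    my ($dst_path, $fileinfo) = @_;   # path relative to the dst root, and the file attribute hash
    return $dst_path =~ m{^etc/} ? 'overwrite' : 'ignore';
};

# Calling it directly shows the decision for two example paths.
print "$_ => ", $collision_policy->($_, {}), "\n" for 'etc/hostname', 'bin/busybox';
```

A coderef like this would then be handed to the exporter with $exporter->on_collision($collision_policy).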
log
$exporter->log('info');
$exporter->log($logger);
Set the logging output object, or log level for 'print' output.
METHODS
rewrite_path
$exporter->rewrite_path($src_prefix, $dst_prefix);
Add a path rewrite rule which replaces occurrences of $src_prefix with $dst_prefix. Only one rewrite occurs per path; they don't cascade. Path prefixes refer to the logical absolute path within the source root and destination root. You may specify these prefixes with or without the leading implied '/'.
Returns $exporter for chaining.
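The rewrite semantics can be illustrated with a small standalone sketch. It is not the module's implementation: the %rules table, the extra 'usr' => 'opt' rule, and the rewrite_once helper are hypothetical names invented here. It only demonstrates the two documented properties, namely that the longest matching prefix wins and that a path is rewritten at most once.

```perl
use strict;
use warnings;

# Hypothetical rule table: source prefix => destination prefix, no leading '/'.
my %rules = (
    'sbin'     => 'bin',
    'usr/bin'  => 'bin',
    'usr/sbin' => 'bin',
    'usr/lib'  => 'lib',
    'usr'      => 'opt',   # assumed extra rule, overlaps the ones above
);

# Perl alternations pick the first branch that matches, so list longer prefixes first.
# The lookahead restricts matches to whole path components.
my $alternation   = join '|', map quotemeta, sort { length($b) <=> length($a) || $a cmp $b } keys %rules;
my $rewrite_regex = qr{^($alternation)(?=/|\z)};

sub rewrite_once {
    my ($path) = @_;
    $path =~ s{^/+}{};                       # the leading '/' is optional
    $path =~ s/$rewrite_regex/$rules{$1}/;   # one substitution only, so rules never cascade
    return $path;
}

print "$_ => ", rewrite_once($_), "\n" for 'usr/bin/env', '/sbin/init', 'usr/share/doc', 'etc/passwd';
```

Sorting by descending length is what makes 'usr/bin/env' use the 'usr/bin' rule rather than the shorter 'usr' rule.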
rewrite_user
$exporter->rewrite_user( $src_name_or_uid => $dst_name_or_uid );
If you rewrite from a UID to a UID, this doesn't consider any names, and does an efficient numeric remapping.
If src is a name, this instantiates "src_userdb" if it doesn't exist, and resolves the name (which must exist), then creates a numeric mapping.
If dst is a name, this instantiates "dst_userdb" if it doesn't exist, and resolves the name (which must exist, but gets auto-imported from src_userdb in the default configuration) then creates a numeric mapping.
rewrite_group
$exporter->rewrite_group( $local_name_or_gid => $exported_name_or_gid );
Same semantics as "rewrite_user" but for groups.
add
$exporter->add($src_path, ...);
$exporter->add(\%file_attrs, ...);
$exporter->add([ $name, $mode, $mode_specific_data, \%other_attrs ]);
Add one or more source paths (relative to "src") or full file specifications to the export. This immediately copies the file to the destination, also triggering a copy of any interpreters or libraries it depends on which weren't already added.
Any item with a src_path attribute will be translated according to "rewrite_path", "rewrite_user", and "rewrite_group". This includes generating the 'name' attribute and also rewriting the contents of files and symlinks. If it is missing attributes, they will be filled in with a call to lstat.
Any item without a src_path is assumed to be already rewritten by the user, and must specify at least the attributes name and mode.
The file attributes are:
name # destination path relative to destination root
src_path # source path relative to source root, no leading '/'
data # literal data content of file (must be bytes, not unicode)
data_path # absolute path of file to load 'data' from, not limited to src dir
dev # device of origin, as per lstat
dev_major # major(dev), if you know it and don't know 'dev'
dev_minor # minor(dev), if you know it and don't know 'dev'
ino # inode, from stat. used with 'dev' for hardlink tracking
mode # permissions and type, as per stat
nlink # number of hard links
uid # user id
gid # group id
rdev # referenced device, for device nodes
rdev_major # major(rdev), if you know it and don't know 'rdev'
rdev_minor # minor(rdev), if you know it and don't know 'rdev'
size # size, in bytes. Can be omitted if 'data' is present
mtime # modification time, as per stat
You can also use the array notation described in "expand_file_stat_array" in Sys::Export. Array notation provides a name attribute rather than a src_path, so those items do not get rewritten.
Returns $exporter for chaining.
src_find
This is a helper function to build lists of source files. It iterates the "src" tree from a given subdirectory, passing each entry to a coderef filter.
@hashrefs= $exporter->src_find(@paths);
@hashrefs= $exporter->src_find($filter, @paths);
@hashrefs= $exporter->src_find(@paths, $filter);
The filter can be a coderef or Regexp-ref. Any other type is considered a path. The filter function runs in the following environment:
- $_
-
the absolute path of the source file
- _
-
the result of lstat on the absolute path of the source file (allowing file tests like -d or -f or -s without running a new stat() call)
- $_[0]
-
the hashref of stat attributes that will be returned by this function if the filter returns true
The callback should return a boolean of whether to include the file in the result. If it returns false for a directory, the directory will still be traversed. If you want to prune a directory tree from being processed, set $_[0]{prune} to a true value before returning.
For a Regexp-ref, you are matching against the full absolute path within "src". If you want a regex to only apply to the relative path of a file, just write it as a sub like
sub { $_[0]{src_path} =~ /pattern/ }
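To make the filter environment concrete, here is a sketch of a filter coderef written against that contract, together with a tiny harness that calls it. The harness (call_filter), the temporary files, and the "prune any directory named cache" policy are all hypothetical; the harness only mimics the three documented inputs ($_, the _ stat cache, and $_[0]) and is not how src_find itself is implemented.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Filter: keep non-empty regular files, skip directories, prune any 'cache' directory.
my $filter = sub {
    my ($attrs) = @_;
    if (-d _) {                                  # reuses the lstat result, no new stat call
        $attrs->{prune} = 1 if m{/cache\z};      # $_ holds the absolute path
        return 0;
    }
    return ((-f _) && (-s _)) ? 1 : 0;
};

# Hypothetical stand-in for what src_find sets up before invoking the filter.
sub call_filter {
    my ($filter, $abs_path, $src_path) = @_;
    my %attrs = (src_path => $src_path);
    local $_ = $abs_path;
    lstat($abs_path) or die "lstat($abs_path): $!";   # primes the '_' filehandle
    my $keep = $filter->(\%attrs) ? 1 : 0;
    return ($keep, $attrs{prune} ? 1 : 0);
}

# Small throwaway tree to exercise the filter.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/cache" or die "mkdir: $!";
open(my $fh, '>', "$root/data.txt") or die "open: $!";
print {$fh} "hello\n";
close $fh;

for my $name (qw(cache data.txt)) {
    my ($keep, $prune) = call_filter($filter, "$root/$name", $name);
    print "$name keep=$keep prune=$prune\n";
}
```

With the real module, the same $filter would be passed as $exporter->src_find($filter, @paths).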
skip
$exporter->skip(@paths);
$exporter->skip({ src_path => $path, ... });
Inform the exporter that it should *not* perform any actions for the specified source path, presumably because you're handling that one specially in some other way.
You may pass hashrefs generated by "src_find", which will include a src_path field.
finish
Apply any postponed changes to the destination filesystem. For instance, this applies mtimes to directories since writing the contents of the directory would have changed the mtime.
get_dst_for_src
my $dst_path= $exporter->get_dst_for_src($src_path);
Returns the relative destination path for a relative source path, rewritten according to the rewrite rules. If no rewrites exist, this just returns $src_path.
get_dst_uid_gid
($uid, $gid)= $exporter->get_dst_uid_gid($uid, $gid);
Given a source uid and gid, return the destination uid and gid. See "USER REMAPPING" for details.
This is the same routine used after every stat on the source filesystem to compute the uid/gid written to dst.
USER REMAPPING
This module tries to be helpful with rewriting UID/GID from your source filesystem to the destination filesystem, but also stay out of your way if you don't need that feature. In the simplest case, you are building an initrd from an environment with the same user database as your final system image and UID/GID can be copied as-is. In other cases, you might be pulling files from Alpine to be used for an initrd that starts a Debian system, and need to map ownership by name instead of number.
The basic rule is that name-based mapping is enabled or disabled by whether attribute "dst_userdb" is defined or not. If you pass that as an initial constructor attribute, then name-based mapping is enabled from the start. If you request a destination name in a call to "rewrite_user" or "rewrite_group", they will automatically instantiate dst_userdb. However, you can also perform ID remapping without name databases. If every call to rewrite_user and rewrite_group exclusively uses numbers, then the numeric mapping is handled without triggering dst_userdb to be created.
If name mapping is enabled, then "src_userdb" must also be defined. If you don't initialize it, it will be automatically instantiated from $src/etc/passwd, falling back to the users of the host system via getpwnam etc.
Name Mapping Behavior
Any time a new not-yet-mapped ID is encountered, the exporter checks the src_userdb to find out what name is associated with that ID. If not found, it may import it from getpwnam/getgrnam. If still not found, it dies. Then it checks for any name-based rewrites to determine what name to look for in dst_userdb, defaulting to the same name as in src_userdb. If dst_userdb doesn't have that name yet, the user is copied from src_userdb, but the exporter croaks if the UID/GID would conflict with another entry in dst_userdb. Once the src UID/GID and dst UID/GID are both known, it adds those to the numeric mapping, so further name lookups are not needed for that source ID.
VERSION
version 0.003
AUTHOR
Michael Conrad <mike@nrdvana.net>
COPYRIGHT AND LICENSE
This software is copyright (c) 2025 by Michael Conrad.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.