NAME

Sys::Export::Unix - Export subsets of a UNIX system

SYNOPSIS

use Sys::Export::Unix;
my $exporter= Sys::Export::Unix->new(
  src => '/', dst => '/initrd',
  rewrite_path => {
    'sbin'     => 'bin',
    'usr/bin'  => 'bin',
    'usr/sbin' => 'bin',
    'usr/lib'  => 'lib',
  },
);
$exporter->add('bin/busybox');

DESCRIPTION

This object contains the logic for exporting unix-style systems.

CONSTRUCTORS

new

Sys::Export::Unix->new(\%attributes); # hashref
Sys::Export::Unix->new(%attributes);  # key/value list

Required attributes:

src

The root of the system to export from (often '/', but you must specify this)

dst

The root of the exported system. This directory must exist, and should be empty unless you specify 'on_collision'.

It can also be an object with 'add' and 'finish' methods, which avoids constructing a staging directory entirely and does not require root permission to operate.

Options:

rewrite_path

Convenience for calling "rewrite_path" using a hashref of { src => dst } pairs.

rewrite_user

Convenience for calling "rewrite_user" using a hashref of { src => dst } pairs.

rewrite_group

Convenience for calling "rewrite_group" using a hashref of { src => dst } pairs.

src_userdb

An instance of Sys::Export::Unix::UserDB, or constructor parameters for one. The default is to read $src/etc/passwd, or fall back to the getpwnam function of the host. See "USER REMAPPING" for more details.

dst_userdb

An instance of Sys::Export::Unix::UserDB, or constructor parameters for one. If defined, this will trigger name-based translations of all UID/GID values written to the destination filesystem. See "USER REMAPPING" for more details.

tmp

A temporary directory where this module can prepare temporary files. If you are using a filesystem destination, it will default to the same device as the staging directory.

When finish is called, this is set to undef so that instances of File::Temp can clean themselves up.

on_collision

Specifies what to do if there is a name collision in the destination. The default (undef) causes an exception unless the existing file is identical to the one that would be written.

Setting this to 'overwrite' will unconditionally replace files as it runs. Setting it to 'ignore' will silently ignore collisions and leave the existing file in place. Setting it to a coderef will provide you with the path and the content that was about to be written to it:

on_collision => sub ($exporter, $fileinfo, $prev_src_path) {
  # src_path is relative to $exporter->src
  # dst_path is relative to $exporter->dst
  # content_ref is a scalar ref with the new contents of the file, possibly rewritten
}

log

This can either be a Log::Any instance, or a string specifying a log level such as "debug" or "trace". The default logging is on STDOUT (level 'info') and simply lists the files being copied and whether they were patched.

ATTRIBUTES

src

The root of the source filesystem. It must be the actual root used by the symlinks and library paths inside this filesystem, or things will break.

src_abs

The abs_path of the root of the source filesystem, always ending with '/'.

src_userdb

An instance of Sys::Export::Unix::UserDB. This attribute is undef until it is needed, unless you specified it to the constructor. See "USER REMAPPING" for details.

dst

The root of the destination filesystem, OR a coderef which receives files which are ready to be recorded. This must be the logical root of your destination filesystem, which will be used when symlinks or library paths refer to '/'. If you want to move files into a subdirectory of the logical destination filesystem, see "rewrite_path". If you provide a coderef, the signature is

sub ($exporter, $file_attrs) { ... }

dst_abs

The abs_path of the root of the destination filesystem, always ending with '/'. This is only defined if dst is not a coderef.

dst_userdb

An instance of Sys::Export::Unix::UserDB. This attribute is undef until it is needed, unless you specified it to the constructor. See "USER REMAPPING" for details.

tmp

The abs_path of a directory to use for temporary staging before renaming into "dst". This must be in the same volume as dst so that rename() can be used to move temporary files into their dst location.

src_path_set

A hashref of all source paths which have been processed, and which destination path they were written as. All paths are logically absolute to their respective roots, but without a leading slash.

dst_path_set

A hashref of all destination paths which have been created (as keys). If the value of the key is defined, it is the source path. If not defined, it means the destination was created without reference to a source path.

dst_uid_used

The set of numeric user IDs which have been written to dst.

dst_gid_used

The set of numeric group IDs which have been written to dst.

path_rewrite_regex

A regex that matches the longest prefix of a source path having a rewrite rule.

log

$exporter->log('info');
$exporter->log($logger);

Set the logging output object, or log level for 'print' output.

METHODS

rewrite_path

$exporter->rewrite_path($src_prefix, $dst_prefix);

Add a path rewrite rule which replaces occurrences of $src_prefix with $dst_prefix. Only one rewrite occurs per path; they don't cascade. Path prefixes refer to the logical absolute path within the source root and destination root. You may specify these prefixes with or without the leading implied '/'.

Returns $exporter for chaining.
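
The "longest prefix, one rewrite per path" rule can be illustrated in plain Perl. This is only a sketch of the idea, not the module's implementation: the %rewrite table, the extra 'usr' rule, and the sketch_dst_for_src helper are hypothetical, and matching only on path-component boundaries is an assumption.

```perl
use strict;
use warnings;

# Hypothetical rule table in the { src_prefix => dst_prefix } form.
my %rewrite = (
  'sbin'     => 'bin',
  'usr/bin'  => 'bin',
  'usr/sbin' => 'bin',
  'usr/lib'  => 'lib',
  'usr'      => 'opt',   # shorter prefix, to show that the longest rule wins
);

# One alternation, longest prefix first, anchored at the start of the path.
# The lookahead requires the prefix to end on a component boundary
# (an assumption of this sketch).
my $alternation = join '|',
  map { quotemeta($_) }
  sort { length($b) <=> length($a) } keys %rewrite;
my $prefix_re = qr{^($alternation)(?=/|\z)};

# Hypothetical helper: apply at most one rewrite to a source path.
sub sketch_dst_for_src {
  my ($src_path) = @_;
  $src_path =~ s{^/}{};   # accept the optional leading '/'
  (my $dst_path = $src_path) =~ s/$prefix_re/$rewrite{$1}/;
  return $dst_path;
}

print sketch_dst_for_src($_), "\n"
  for qw( usr/sbin/ip /usr/share/doc sbin/init etc/fstab );
```

Because the substitution runs once and is anchored at the start, a path that has already been rewritten is never matched against the rules a second time.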

rewrite_user

$exporter->rewrite_user( $src_name_or_uid => $dst_name_or_uid );

If you rewrite from a UID to a UID, this doesn't consider any names, and does an efficient numeric remapping.

If src is a name, this instantiates "src_userdb" if it doesn't exist, and resolves the name (which must exist), then creates a numeric mapping.

If dst is a name, this instantiates "dst_userdb" if it doesn't exist, and resolves the name (which must exist, but gets auto-imported from src_userdb in the default configuration) then creates a numeric mapping.

rewrite_group

$exporter->rewrite_group( $local_name_or_gid => $exported_name_or_gid );

Same semantics as "rewrite_user" but for groups.

add

$exporter->add($src_path, ...);
$exporter->add(\%file_attrs, ...);

Add a source path (logically absolute with respect to "src") to the export. This immediately copies the file to the destination, possibly rewriting paths within it, and then triggers a copy of any libraries or interpreters it depends on.

If specified directly, file attributes are:

name            # destination path relative to destination root
src_path        # source path relative to source root, no leading '/'
data            # literal data content of file (must be bytes, not unicode)
data_path       # absolute path of file to load 'data' from
dev             # device, from stat
dev_major       # major(dev), if you know it and don't know 'dev'
dev_minor       # minor(dev), if you know it and don't know 'dev'
ino             # inode, from stat
mode            # permissions and type, as per stat
nlink           # number of hard links
uid             # user id
gid             # group id
rdev            # referenced device, for device nodes
rdev_major      # major(rdev), if you know it and don't know 'rdev'
rdev_minor      # minor(rdev), if you know it and don't know 'rdev'
size            # size, in bytes.  Can be omitted if 'data' is present
mtime           # modification time, as per stat

If you don't specify src_path, path rewrites will not be applied to the contents of the file or symlink (on the assumption that you used paths relative to the destination).
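
As a rough illustration of the hashref form, the following builds attributes for a small file generated in memory. The file name and content are made up for the example, and the call to add is left as a comment because it assumes an $exporter constructed as in the SYNOPSIS.

```perl
use strict;
use warnings;
use Fcntl ':mode';   # S_IFREG and S_ISREG

my $motd = "Welcome to the initrd\n";

# Attributes for a regular file with literal content. There is no src_path,
# so no path rewrites are applied to the content. 'size' is left out
# because 'data' is present.
my %file_attrs = (
  name  => 'etc/motd',       # destination path relative to the destination root
  data  => $motd,            # bytes, not unicode
  mode  => S_IFREG | 0644,   # file type and permissions, as stat reports them
  uid   => 0,
  gid   => 0,
  mtime => time(),
);

# With an exporter built as in the SYNOPSIS, this would be passed as:
#   $exporter->add(\%file_attrs);
printf "%s mode=%06o\n", $file_attrs{name}, $file_attrs{mode};
```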

Returns $exporter for chaining.

skip

$exporter->skip($src_path);

Inform the exporter that it should *not* perform any actions for the specified source path, presumably because you're handling that one specially in some other way.

finish

Apply any postponed changes to the destination filesystem. For instance, this applies mtimes to directories since writing the contents of the directory would have changed the mtime.

get_dst_for_src

my $dst_path= $exporter->get_dst_for_src($src_path);

Returns the relative destination path for a relative source path, rewritten according to the rewrite rules. If no rewrites exist, this just returns $src_path.

get_dst_uid_gid

($uid, $gid)= $exporter->get_dst_uid_gid($uid, $gid);

Given a source uid and gid, return the destination uid and gid. See "USER REMAPPING" for details.

This is the same routine used after every stat on the source filesystem to compute the uid/gid written to dst.

USER REMAPPING

This module tries to be helpful with rewriting UID/GID from your source filesystem to the destination filesystem, but also stay out of your way if you don't need that feature. In the simplest case, you are building an initrd from an environment with the same user database as your final system image and UID/GID can be copied as-is. In other cases, you might be pulling files from Alpine to be used for an initrd that starts a Debian system, and need to map ownership by name instead of number.

The basic rule is that name-based mapping is enabled or disabled by whether attribute "dst_userdb" is defined or not. If you pass that as an initial constructor attribute, then name-based mapping is enabled from the start. If you request a destination name in a call to "rewrite_user" or "rewrite_group", that call will automatically instantiate dst_userdb. However, you can also perform ID remapping without name databases. If every call to rewrite_user and rewrite_group exclusively uses numbers, then the numeric mapping is handled without triggering dst_userdb to be created.

If name mapping is enabled, then "src_userdb" must also be defined. If you don't initialize it, it will be automatically instantiated from $src/etc/passwd, falling back to the users of the host system via getpwnam etc.

Name Mapping Behavior

Any time a new not-yet-mapped ID is encountered, the exporter checks the src_userdb to find out what name is associated with that ID. If not found, it may import it from getpwnam/getgrnam. If still not found, it dies. Then it checks for any name-based rewrites to determine what name to look for in dst_userdb, defaulting to the same name as in src_userdb. If dst_userdb doesn't have that name yet, the user is copied from src_userdb, but the exporter croaks if the UID/GID would conflict with another entry in dst_userdb. Once the src UID/GID and dst UID/GID are both known, it adds those to the numeric mapping, so further name lookups are not needed for that source ID.
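
The lookup order described above can be sketched with plain hashes standing in for the two user databases. Everything here is hypothetical (the hash names, the sample users, and the sketch_dst_uid helper); it does not use Sys::Export::Unix::UserDB and omits the getpwnam fallback.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the user databases: name => uid.
my %src_users    = ( root => 0, daemon => 2, www => 82, ntp => 123 );
my %dst_users    = ( root => 0, daemon => 1, 'www-data' => 33 );
my %name_rewrite = ( www => 'www-data' );   # name-based rewrite rule
my %uid_map;                                # numeric cache: src uid => dst uid

sub sketch_dst_uid {
  my ($src_uid) = @_;
  return $uid_map{$src_uid} if exists $uid_map{$src_uid};

  # 1. Find the name that owns this ID in the source database.
  my ($name) = grep { $src_users{$_} == $src_uid } sort keys %src_users;
  die "No user for source UID $src_uid\n" unless defined $name;

  # 2. Apply any name-based rewrite, defaulting to the same name.
  my $dst_name = $name_rewrite{$name} // $name;

  # 3. Copy the user into the destination database if it is missing,
  #    refusing a UID that another destination entry already holds.
  unless (exists $dst_users{$dst_name}) {
    die "UID $src_uid is already taken in the destination\n"
      if grep { $_ == $src_uid } values %dst_users;
    $dst_users{$dst_name} = $src_uid;
  }

  # 4. Cache the numeric pair so later lookups skip the name step.
  return $uid_map{$src_uid} = $dst_users{$dst_name};
}

printf "%d -> %d\n", $_, sketch_dst_uid($_) for 0, 2, 82, 123;
```

In this sketch the 'ntp' user is absent from the destination and its UID is free, so it is copied across unchanged, while 'daemon' and 'www' are translated by name.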

VERSION

version 0.001

AUTHOR

Michael Conrad <mike@nrdvana.net>

COPYRIGHT AND LICENSE

This software is copyright (c) 2025 by Michael Conrad.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.