NAME

File::Rdiff -- generate remote signatures and patch files using librsync

SYNOPSIS

use File::Rdiff

DESCRIPTION

A more-or-less direct interface to librsync (http://librsync.sourceforge.net/).

For usage examples (better than this very sparse documentation), see the two example scripts below.

LIBRSYNC_VERSION

A constant describing the version of the rsync library used in this module. My version claimed to be "0.9.5 librsync" when I wrote this document ;)

$oldlevel = trace_level [$newlevel]

Return the current tracelevel and optionally set a new one.

$oldcb = trace_to [$newcb]

Return the current trace callback and optionally set a new one. The callback will be called with the log level as the first argument and the log message as the second one.

Calling trace_to with undef will restore the default handler (which currently prints the message to standard error).

supports_trace

Returns wether debugging traces are supported in this version of the library.

$msg = strerror $rcode

Returns a string representation of the given error code. You usually just "exit(1)" or something when a function/method fails, as all (most?) librsync functions log the error properly, so there is rarely a need to call this function.

$md4 = md4_file $fh
sig_file $old_fh, $sig_fh[, $block_len[, $strong_len]]
$sig = loadsig_file $fh
delta_file $signature, $new_fh, $delta_fh
patch_file $base_fh, $delta_fh, $new_fh

The File::Rdiff::Job class

The File::Rdiff::Buffers class

This class contains the input and output buffers for the non-blocking interface. It is slightly unusual in that it allows direct manipulation of (some) of it's internal variables.

new File::Rdiff::Buffers [$outsize]

Creates and initializes a new buffers structure. $outsize specifies the maximum number of bytes to be read into the output scalar until it is considered full. The default is 64k.

$buffers->in($in)

Set the next block of input to consume. Data will be read from this scalar (no copy will be made!) until all bytes have been consumed or a new input scalar is set.

$out = $buffers->out

Return the current output data and create a new buffer. Returns undef if no data has been accumulated.

$buffers->eof

Set the eof flag to true. This indicates that no data is following the current input scalar.

$buffers->avail_in

Returns the numer of bytes still available for input. If there are no input bytes available but the eof flag is set, returns -1 (to make boolean tests easy to check wether to supply more data easier).

$buffers->avail_out

Returns the number of bytes still available in the output buffer.

$buffers->size

The number of bytes that have been accumulated in the current buffer so far.

The File::Rdiff::Job class

It is possible to have multiple jobs running at the same time. The idea is to create job objects and then drive them incrementally with input or output data until all date has been processed.

new_sig File::Rdiff::Job [$new_block_len[, $strong_sum_len]]

Create a job that converts a base stream into a signature stream (i.e. creates signatures).

new_loadsig File::Rdiff::Job

Create a job that converts the input stream into a in-memory File::Rdiff::Signature object. The signature object can be fetched anytime with the signature-method.

new_delta File::Rdiff::Job $signature

Creates a job that creates (outputs) a delta between the input stream (the newer file) and the file represented by the given signature.

new_patch File::Rdiff::Job $callback_or_filehandle

Creates a job that patches a file according to the input stream (a delta stream). The single argument is used to read the base file contents. If it is a filehandle, it must be a seekable handle to the base file.

If it is a coderef, it will be called whenever base file data must be read. Two arguments will be passed: the file offset and the length. The callback should eithe return the data read (must be a string, not a number!) or an error code.

$job->iter($buffers)

Do as much work as possible given the input and/or output data in the File::Rdiff::Buffers structure and return either DONE when the job is finished, BLOCKED if there aren't enough bytes available in the input or output buffers (in which case you should deplete the output buffer and/or fill the input buffer and loop), or some error code indicating that the operation failed.

$job->signature

Only valid for new_loadsig, so look there.

EXAMPLE PROGRAM ONE

Very simple program that mimics librsync's rdiff, using the simple file utility functions. see example below for the same program, written using the nonblocking API.

#!/usr/bin/perl

use File::Rdiff qw(:trace :file);

trace_level(LOG_INFO);

if ($ARGV[0] eq "signature") {
   open $base, "<$ARGV[1]" or die "$ARGV[1]: $!";
   open $sig,  ">$ARGV[2]" or die "$ARGV[2]: $!";

   File::Rdiff::sig_file $base, $sig;
} elsif ($ARGV[0] eq "delta") {
   open $sig,   "<$ARGV[1]" or die "$ARGV[1]: $!";
   open $new,   "<$ARGV[2]" or die "$ARGV[2]: $!";
   open $delta, ">$ARGV[3]" or die "$ARGV[3]: $!";

   $sig = loadsig_file $sig;

   ref $sig or exit 1;

   $sig->build_hash_table;

   File::Rdiff::delta_file $sig, $new, $delta;
} elsif ($ARGV[0] eq "patch") {
   open $base,  "<$ARGV[1]" or die "$ARGV[1]: $!";
   open $delta, "<$ARGV[2]" or die "$ARGV[2]: $!";
   open $new,   ">$ARGV[3]" or die "$ARGV[3]: $!";

   File::Rdiff::patch_file $base, $delta, $new;
} else {
   print <<EOF;
$0 signature BASE SIGNATURE
$0 delta SIGNATURE NEW DELTA
$0 patch BASE DELTA NEW
EOF
   exit (1);
}

EXAMPLE PROGRAM TWO

Same as above, but written using the callback-based, "nonblocking", API.

#!/usr/bin/perl

use File::Rdiff qw(:trace :nonblocking);

trace_level(LOG_INFO);

if ($ARGV[0] eq "signature") {
   open $basis, "<", $ARGV[1]
      or die "$ARGV[1]: $!";
   open $sig, ">", $ARGV[2]
      or die "$ARGV[2]: $!";

   my $job = new_sig File::Rdiff::Job 128;
   my $buf = new File::Rdiff::Buffers 4096;

   while ($job->iter($buf) == BLOCKED) {
      # fetch more input data
      $buf->avail_in or do {
         my $in;
         65536 == sysread $basis, $in, 65536 or $buf->eof;
         $buf->in($in);
      };
      print $sig $buf->out;
   }
   print $sig $buf->out;

} elsif ($ARGV[0] eq "delta") {
   open $sig,   "<$ARGV[1]" or die "$ARGV[1]: $!";
   open $new,   "<$ARGV[2]" or die "$ARGV[2]: $!";
   open $delta, ">$ARGV[3]" or die "$ARGV[3]: $!";

   # first load the signature into memory
   my $job = new_loadsig File::Rdiff::Job;
   my $buf = new File::Rdiff::Buffers 0;

   do {
      $buf->avail_in or do {
         my $in;
         65536 == sysread $sig, $in, 65536 or $buf->eof;
         $buf->in($in);
      };
   } while $job->iter($buf) == BLOCKED;

   $sig = $job->signature;

   $sig->build_hash_table;

   # now create the delta file
   my $job = new_delta File::Rdiff::Job $sig;
   my $buf = new File::Rdiff::Buffers 65536;

   do {
      $buf->avail_in or do {
         my $in;
         65536 == sysread $new, $in, 65536 or $buf->eof;
         $buf->in($in);
      };
      print $delta $buf->out;
   } while $job->iter($buf) == BLOCKED;
   print $delta $buf->out;

} elsif ($ARGV[0] eq "patch") {
   open $base,  "<$ARGV[1]" or die "$ARGV[1]: $!";
   open $delta, "<$ARGV[2]" or die "$ARGV[2]: $!";
   open $new,   ">$ARGV[3]" or die "$ARGV[3]: $!";

   # NYI
   File::Rdiff::patch_file $base, $delta, $new;
} else {
   print <<EOF;
$0 signature BASIS SIGNATURE
$0 delta SIGNATURE NEW DELTA
$0 patch BASE DELTA NEW
EOF
   exit (1);
}

SEE ALSO

File::Rsync, rdiff1 (usage example using simple file API), rdiff2 (example using nonblocking API).

BUGS

- not well-tested so far.

- low memory will result in segfaults rather than croaks.

- no access to statistics yet

- documentation leaves much to be deserved.

AUTHOR

Marc Lehmann <schmorp@schmorp.de>
http://home.schmorp.de/