NAME

Compress::Zlib - Interface to zlib compression library

SYNOPSIS

use Compress::Zlib ;

($d, $status) = deflateInit( [OPT] ) ;
($out, $status) = $d->deflate($buffer) ;
($out, $status) = $d->flush() ;
$d->dict_adler() ;

($i, $status) = inflateInit( [OPT] ) ;
($out, $status) = $i->inflate($buffer) ;
$i->dict_adler() ;

$dest = compress($source) ;
$dest = uncompress($source) ;

$gz = gzopen($filename or filehandle, $mode) ;
$bytesread = $gz->gzread($buffer [,$size]) ;
$bytesread = $gz->gzreadline($line) ;
$byteswritten = $gz->gzwrite($buffer) ;
$status = $gz->gzflush($flush) ;
$status = $gz->gzclose() ;
$errstring = $gz->gzerror() ; 
$gzerrno

$dest = Compress::Zlib::memGzip($buffer) ;

$crc = adler32($buffer [,$crc]) ;
$crc = crc32($buffer [,$crc]) ;

ZLIB_VERSION

DESCRIPTION

The Compress::Zlib module provides a Perl interface to the zlib compression library (see "AUTHOR" for details about where to get zlib). Most of the functionality provided by zlib is available in Compress::Zlib.

The module can be split into two general areas of functionality, namely in-memory compression/decompression and read/write access to gzip files. Each of these areas will be discussed separately below.

DEFLATE

The interface Compress::Zlib provides to the in-memory deflate (and inflate) functions has been modified to fit into a Perl model.

The main difference is that for both inflation and deflation, the Perl interface will always consume the complete input buffer before returning. Also the output buffer returned will be automatically grown to fit the amount of output available.

Here is a definition of the interface available:

($d, $status) = deflateInit( [OPT] )

Initialises a deflation stream.

It combines the features of the zlib functions deflateInit, deflateInit2 and deflateSetDictionary.

If successful, in a list context it returns the initialised deflation stream, $d, and a $status of Z_OK. In a scalar context it returns the deflation stream, $d, only.

If not successful, the returned deflation stream ($d) will be undef and $status will hold the exact zlib error code.

The function optionally takes a number of named options specified as -Name=>value pairs. This allows individual options to be tailored without having to specify them all in the parameter list.

For backward compatibility, the options may instead be passed as a single optional parameter: a reference to a hash containing the name=>value pairs. Either way, the options allow the deflation interface to be tailored.

Here is a list of the valid options:

-Level

Defines the compression level. Valid values are 1 through 9, Z_BEST_SPEED, Z_BEST_COMPRESSION, and Z_DEFAULT_COMPRESSION.

The default is -Level =>Z_DEFAULT_COMPRESSION.

-Method

Defines the compression method. The only valid value at present (and the default) is -Method =>Z_DEFLATED.

-WindowBits

For a definition of the meaning and valid values for WindowBits refer to the zlib documentation for deflateInit2.

Defaults to -WindowBits =>MAX_WBITS.

-MemLevel

For a definition of the meaning and valid values for MemLevel refer to the zlib documentation for deflateInit2.

Defaults to -MemLevel =>MAX_MEM_LEVEL.

-Strategy

Defines the strategy used to tune the compression. The valid values are Z_DEFAULT_STRATEGY, Z_FILTERED and Z_HUFFMAN_ONLY.

The default is -Strategy =>Z_DEFAULT_STRATEGY.

-Dictionary

When a dictionary is specified Compress::Zlib will automatically call deflateSetDictionary directly after calling deflateInit. The Adler32 value for the dictionary can be obtained by calling the method $d->dict_adler().

The default is no dictionary.

-Bufsize

Sets the initial size for the deflation buffer. If the buffer has to be reallocated to increase the size, it will grow in increments of Bufsize.

The default is 4096.

Here is an example of using the deflateInit optional parameter list to override the default buffer size and compression level. All other options will take their default values.

deflateInit( -Bufsize => 300, 
             -Level => Z_BEST_SPEED  ) ;
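The calling conventions described above can be sketched side by side. This is a minimal illustration only: the variable names and the particular option values are arbitrary choices, not recommendations.

```perl
use strict;
use warnings;
use Compress::Zlib;

# Named-option form. In list context the zlib status code is returned too.
my ($d1, $status1) = deflateInit(-Level   => Z_BEST_COMPRESSION,
                                 -Bufsize => 1024);
die "deflateInit failed: status $status1\n" unless defined $d1;

# Backward-compatible form: the same kind of options in a hash reference.
my ($d2, $status2) = deflateInit({ -Level => Z_BEST_SPEED });
die "deflateInit failed: status $status2\n" unless defined $d2;

# Scalar context returns only the stream object.
my $d3 = deflateInit()
    or die "Cannot create a deflation stream\n";
```

Checking the stream for undef is sufficient to detect failure; $status only adds the reason.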

($out, $status) = $d->deflate($buffer)

Deflates the contents of $buffer. The buffer can either be a scalar or a scalar reference. When finished, $buffer will be completely processed (assuming there were no errors). If the deflation was successful it returns the deflated output, $out, and a status value, $status, of Z_OK.

On error, $out will be undef and $status will contain the zlib error code.

In a scalar context deflate will return $out only.

As with the deflate function in zlib, it is not necessarily the case that any output will be produced by this method. So do not treat an empty $out as a sign of an error; test $status instead.

($out, $status) = $d->flush()

Finishes the deflation. Any pending output will be returned via $out. $status will have a value Z_OK if successful.

In a scalar context flush will return $out only.

Note that flushing can degrade the compression ratio, so it should only be used to terminate a deflation.
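The following sketch shows one way to combine deflate and flush when the input arrives in pieces. The chunk contents and variable names are made up for illustration. The point is that each call is judged by its status, and output is accumulated regardless of whether a given call produced any.

```perl
use strict;
use warnings;
use Compress::Zlib;

# Illustrative input, delivered in two pieces.
my @chunks   = ("first chunk of data\n", "second chunk of data\n" x 50);
my $original = join '', @chunks;

my $d = deflateInit()
    or die "Cannot create a deflation stream\n";

my $deflated = '';
for my $chunk (@chunks) {
    my ($out, $status) = $d->deflate($chunk);

    # Test the status, not the output: $out may legitimately be empty.
    $status == Z_OK
        or die "deflation failed: status $status\n";

    $deflated .= $out;
}

# flush() ends the stream and returns whatever output was still pending.
my ($out, $status) = $d->flush();
$status == Z_OK
    or die "flush failed: status $status\n";
$deflated .= $out;

printf "%d bytes in, %d bytes out\n", length $original, length $deflated;
```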

$d->dict_adler()

Returns the adler32 value for the dictionary.

Example

Here is a trivial example of using deflate. It simply reads standard input, deflates it and writes it to standard output.

use Compress::Zlib ;

binmode STDIN;
binmode STDOUT;

$x = deflateInit()
   or die "Cannot create a deflation stream\n" ;

while (<>)
{
    ($output, $status) = $x->deflate($_) ;

    $status == Z_OK
        or die "deflation failed\n" ;

    print $output ;
}

($output, $status) = $x->flush() ;

$status == Z_OK
    or die "deflation failed\n" ;

print $output ;

INFLATE

Here is a definition of the interface:

($i, $status) = inflateInit()

Initialises an inflation stream.

In a list context it returns the inflation stream, $i, and the zlib status code ($status). In a scalar context it returns the inflation stream only.

If successful, $i will hold the inflation stream and $status will be Z_OK.

If not successful, $i will be undef and $status will hold the zlib error code.

The function optionally takes a number of named options specified as -Name=>value pairs. This allows individual options to be tailored without having to specify them all in the parameter list.

For backward compatibility, the options may instead be passed as a single optional parameter: a reference to a hash containing the name=>value pairs. Either way, the options allow the inflation interface to be tailored.

Here is a list of the valid options:

-WindowBits

For a definition of the meaning and valid values for WindowBits refer to the zlib documentation for inflateInit2.

Defaults to -WindowBits =>MAX_WBITS.

-Bufsize

Sets the initial size for the inflation buffer. If the buffer has to be reallocated to increase the size, it will grow in increments of Bufsize.

Default is 4096.

-Dictionary

The default is no dictionary.

Here is an example of using the inflateInit optional parameter to override the default buffer size.

inflateInit( -Bufsize => 300 ) ;
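The -Dictionary option is easiest to understand as a round trip. The sketch below assumes that the same dictionary must be supplied to both deflateInit and inflateInit; the dictionary and message strings are purely illustrative.

```perl
use strict;
use warnings;
use Compress::Zlib;

my $dictionary = "hello world";              # illustrative dictionary
my $message    = "hello hello world world";  # illustrative data

# Deflate using the dictionary.
my $d = deflateInit(-Dictionary => $dictionary)
    or die "Cannot create a deflation stream\n";

my ($part, $dstatus) = $d->deflate($message);
$dstatus == Z_OK or die "deflation failed\n";

my ($rest, $fstatus) = $d->flush();
$fstatus == Z_OK or die "flush failed\n";

my $deflated = $part . $rest;

# Inflate, supplying the same dictionary (assumed to be required).
my $i = inflateInit(-Dictionary => $dictionary)
    or die "Cannot create an inflation stream\n";

my ($inflated, $istatus) = $i->inflate($deflated);
$istatus == Z_STREAM_END or die "inflation failed\n";

print "round trip ok\n" if $inflated eq $message;
```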

($out, $status) = $i->inflate($buffer)

Inflates the complete contents of $buffer. The buffer can either be a scalar or a scalar reference.

The returned $status will be Z_OK if successful, or Z_STREAM_END if the end of the compressed data has been reached.

The $buffer parameter is modified by inflate. On completion it will contain what remains of the input buffer after inflation. This means that $buffer will be an empty string when the return status is Z_OK. When the return status is Z_STREAM_END the $buffer parameter will contain what (if anything) was stored in the input buffer after the deflated data stream.

This feature is needed when processing a file format that encapsulates a deflated data stream (e.g. gzip, zip).
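A small sketch of the leftover-input behaviour described above. It builds a buffer by hand, with a complete deflated stream followed by some unrelated bytes; the payload and trailer values are invented for the illustration.

```perl
use strict;
use warnings;
use Compress::Zlib;

my $payload = "payload text\n" x 10;
my $trailer = "TRAILER";    # illustrative bytes that follow the stream

# compress() yields a complete deflated stream; append unrelated bytes.
my $buffer = compress($payload) . $trailer;

my $i = inflateInit()
    or die "Cannot create an inflation stream\n";

my ($out, $status) = $i->inflate($buffer);

$status == Z_STREAM_END
    or die "inflation failed: status $status\n";

# $buffer now holds only what followed the deflated stream.
print "inflated ", length($out), " bytes; left over: '$buffer'\n";
```

Because $buffer is modified, it has to be a variable rather than a literal or an expression.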

$i->dict_adler()

Example

Here is an example of using inflate.

use Compress::Zlib ;

$x = inflateInit()
   or die "Cannot create an inflation stream\n" ;

$input = '' ;
binmode STDIN;
binmode STDOUT;

while (read(STDIN, $input, 4096))
{
    ($output, $status) = $x->inflate(\$input) ;

    print $output 
        if $status == Z_OK or $status == Z_STREAM_END ;

    last if $status != Z_OK ;
}

die "inflation failed\n"
    unless $status == Z_STREAM_END ;

COMPRESS/UNCOMPRESS

Two high-level functions are provided by zlib to perform in-memory compression. They are compress and uncompress. Two Perl subs are provided which provide similar functionality.

$dest = compress($source) ;

Compresses $source. If successful it returns the compressed data. Otherwise it returns undef.

The source buffer can either be a scalar or a scalar reference.

$dest = uncompress($source) ;

Uncompresses $source. If successful it returns the uncompressed data. Otherwise it returns undef.

The source buffer can either be a scalar or a scalar reference.
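As a quick illustration of the two functions, here is a round trip that also exercises the undef return on bad input. The sample strings are arbitrary.

```perl
use strict;
use warnings;
use Compress::Zlib;

my $source = "the quick brown fox\n" x 100;

my $packed = compress($source);
defined $packed
    or die "compress failed\n";

# A scalar reference is accepted in place of the scalar itself.
my $unpacked = uncompress(\$packed);
defined $unpacked
    or die "uncompress failed\n";

printf "%d bytes compressed to %d bytes\n", length $source, length $packed;

# Input that is not a compressed stream is reported by returning undef.
my $garbage = "this is not compressed data";
my $bad     = uncompress($garbage);
print "invalid input was rejected\n" unless defined $bad;
```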

GZIP INTERFACE

A number of functions are supplied in zlib for reading and writing gzip files. This module provides an interface to most of them. In general the interface provided by this module operates identically to the functions provided by zlib. Any differences are explained below.

$gz = gzopen(filename or filehandle, mode)

This function operates identically to the zlib equivalent except that it returns an object which is used to access the other gzip methods.

As with the zlib equivalent, the mode parameter is used to specify both whether the file is opened for reading or writing and to optionally specify a compression level. Refer to the zlib documentation for the exact format of the mode parameter.

If a reference to an open filehandle is passed in place of the filename, gzdopen will be called behind the scenes. The third example at the end of this section, gzstream, uses this feature.

$bytesread = $gz->gzread($buffer [, $size]) ;

Reads $size bytes from the compressed file into $buffer. If $size is not specified, it will default to 4096. If the scalar $buffer is not large enough, it will be extended automatically.

Returns the number of bytes actually read. On EOF it returns 0 and in the case of an error, -1.

$bytesread = $gz->gzreadline($line) ;

Reads the next line from the compressed file into $line.

Returns the number of bytes actually read. On EOF it returns 0 and in the case of an error, -1.

It is legal to intermix calls to gzread and gzreadline.

At this time gzreadline ignores the variable $/ ($INPUT_RECORD_SEPARATOR or $RS when English is in use). The end of a line is denoted by the C character '\n'.

$byteswritten = $gz->gzwrite($buffer) ;

Writes the contents of $buffer to the compressed file. Returns the number of bytes actually written, or 0 on error.

$status = $gz->gzflush($flush) ;

Flushes all pending output into the compressed file. Works identically to the zlib function it interfaces to. Note that the use of gzflush can degrade compression.

Refer to the zlib documentation for the valid values of $flush.

$gz->gzclose

Closes the compressed file. Any pending data is flushed to the file before it is closed.
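The methods described so far are enough for a complete write-then-read cycle. The sketch below uses File::Temp to obtain a scratch directory; the file name and text are illustrative only.

```perl
use strict;
use warnings;
use Compress::Zlib;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/example.gz";    # illustrative file name
my $text = "line one\nline two\nline three\n";

# Write a gzip file.
my $gz = gzopen($file, "wb")
    or die "Cannot open $file: $gzerrno\n";
$gz->gzwrite($text)
    or die "error writing: $gzerrno\n";
$gz->gzclose();

# Read it back, one block (4096 bytes by default) at a time.
$gz = gzopen($file, "rb")
    or die "Cannot open $file: $gzerrno\n";

my ($content, $buffer) = ('', '');
while ((my $bytesread = $gz->gzread($buffer)) > 0) {
    $content .= $buffer;
}
$gz->gzclose();

print "read back ", length($content), " bytes\n";
```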

$gz->gzerror

Returns the zlib error message or number for the last operation associated with $gz. The return value will be the zlib error number when used in a numeric context and the zlib error message when used in a string context. The zlib error number constants, shown below, are available for use.

Z_OK
Z_STREAM_END
Z_ERRNO
Z_STREAM_ERROR
Z_DATA_ERROR
Z_MEM_ERROR
Z_BUF_ERROR

$gzerrno

The $gzerrno scalar holds the error code associated with the most recent gzip routine. Note that unlike gzerror(), the error is not associated with a particular file.

As with gzerror() it returns an error number in numeric context and an error message in string context. Unlike gzerror() though, the error message will correspond to the zlib message when the error is associated with zlib itself, or the UNIX error message when it is not (i.e. zlib returned Z_ERRNO).

As there is an overlap between the error numbers used by zlib and UNIX, $gzerrno should only be used to check for the presence of an error in numeric context. Use gzerror() to check for specific zlib errors. The gzcat example below shows how the variable can be used safely.

Examples

Here is an example script which uses the interface. It implements a gzcat function.

    use Compress::Zlib ;

    die "Usage: gzcat file...\n"
        unless @ARGV ;

    foreach $file (@ARGV) {
        $gz = gzopen($file, "rb")
            or die "Cannot open $file: $gzerrno\n" ;

        print $buffer 
            while $gz->gzread($buffer) > 0 ;
        die "Error reading from $file: $gzerrno\n" 
            if $gzerrno != Z_STREAM_END ;
    
        $gz->gzclose() ;
    }

Below is a script which makes use of gzreadline. It implements a very simple grep-like script.

use Compress::Zlib ;

die "Usage: gzgrep pattern file...\n"
    unless @ARGV >= 2;

$pattern = shift ;

foreach $file (@ARGV) {
    $gz = gzopen($file, "rb") 
         or die "Cannot open $file: $gzerrno\n" ;

    while ($gz->gzreadline($_) > 0) {
        print if /$pattern/ ;
    }

    die "Error reading from $file: $gzerrno\n" 
        if $gzerrno != Z_STREAM_END ;

    $gz->gzclose() ;
}

This script, gzstream, does the opposite of the gzcat script above. It reads from standard input and writes a gzip file to standard output.

    use Compress::Zlib ;

    binmode STDOUT; # gzopen only sets it on the fd

    my $gz = gzopen(\*STDOUT, "wb")
        or die "Cannot open stdout: $gzerrno\n" ;

    while (<>) {
        $gz->gzwrite($_)
            or die "error writing: $gzerrno\n" ;
    }

    $gz->gzclose ;

Compress::Zlib::memGzip

This function is used to create an in-memory gzip file. It creates a minimal gzip header.

$dest = Compress::Zlib::memGzip($buffer) ;

If successful, it returns the in-memory gzip file, otherwise it returns undef.

The buffer parameter can either be a scalar or a scalar reference.
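To make "in-memory gzip file" a little more concrete, the sketch below inspects the returned string. It assumes the standard gzip layout from RFC 1952 (two magic bytes at the start, and an eight-byte trailer holding the CRC-32 and the uncompressed length, both little-endian); nothing here is specific to how memGzip is implemented.

```perl
use strict;
use warnings;
use Compress::Zlib;

my $data = "some data to wrap in a gzip container\n";

my $gzipped = Compress::Zlib::memGzip($data);
defined $gzipped
    or die "memGzip failed\n";

# Assumed layout (RFC 1952): magic bytes first, CRC-32 and length last.
my $magic = substr($gzipped, 0, 2);
my ($crc, $isize) = unpack "V V", substr($gzipped, -8);

printf "magic: %s, crc matches: %s, length matches: %s\n",
    unpack("H4", $magic),
    ($crc   == crc32($data)  ? "yes" : "no"),
    ($isize == length($data) ? "yes" : "no");
```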

CHECKSUM FUNCTIONS

Two functions are provided by zlib to calculate a checksum. For the Perl interface, the order of the two parameters in both functions has been reversed. This allows both running checksums and one off calculations to be done.

$crc = adler32($buffer [,$crc]) ;
$crc = crc32($buffer [,$crc]) ;

The buffer parameters can either be a scalar or a scalar reference.

If the $crc parameter is undef, the crc value will be reset.
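The difference between a one-off and a running checksum can be sketched as follows. The input strings are arbitrary; the running form feeds the previous result back in as the second parameter, starting from undef.

```perl
use strict;
use warnings;
use Compress::Zlib;

my @parts = ("hello ", "world");
my $whole = join '', @parts;

# One-off calculation over the complete buffer.
my $crc_one_off   = crc32($whole);
my $adler_one_off = adler32($whole);

# Running calculation: undef as the second parameter starts a new checksum.
my ($crc_running, $adler_running);
for my $part (@parts) {
    $crc_running   = crc32($part, $crc_running);
    $adler_running = adler32($part, $adler_running);
}

printf "crc32: %08x (one-off) %08x (running)\n",
    $crc_one_off, $crc_running;
printf "adler32: %08x (one-off) %08x (running)\n",
    $adler_one_off, $adler_running;
```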

CONSTANTS

All the zlib constants are automatically imported when you make use of Compress::Zlib.

AUTHOR

The Compress::Zlib module was written by Paul Marquess, Paul.Marquess@btinternet.com. The latest copy of the module can be found on CPAN in modules/by-module/Compress/Compress-Zlib-x.x.tar.gz.

The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu. It is available at ftp://ftp.uu.net/pub/archiving/zip/zlib* and ftp://swrinde.nde.swri.edu/pub/png/src/zlib*. Alternatively check out the zlib home page at http://quest.jpl.nasa.gov/zlib/.

Questions about zlib itself should be sent to zlib@quest.jpl.nasa.gov or, if this fails, to the addresses given for the authors above.

MODIFICATION HISTORY

0.1 2nd October 1995.

First public release of Compress::Zlib.

0.2 5th October 1995.

Fixed a minor allocation problem in Zlib.xs

0.3 12th October 1995.

Added prototype specification.

0.4 25th June 1996.

  1. Documentation update.

  2. Upgraded to support zlib 1.0.2

  3. Added dictionary interface.

  4. Fixed bug in gzreadline - previously it would keep returning the same buffer. This bug was reported by Helmut Jarausch

  5. Removed the dependency on zutil.h, and so dropped support for

    DEF_MEM_LEVEL (use MAX_MEM_LEVEL instead)
    DEF_WBITS     (use MAX_WBITS instead)

0.50 19th Feb 1997

  1. Confirmed that no changes were necessary for zlib 1.0.3 or 1.0.4.

  2. The optional parameters for deflateInit and inflateInit can now be specified as an associative array in addition to a reference to an associative array. They can also accept the -Name syntax.

  3. gzopen can now optionally take a reference to an open filehandle in place of a filename. In this case it will call gzdopen.

  4. Added gzstream example script.

1.00 14 Nov 1997

  1. The following functions can now take a scalar reference in place of a scalar for their buffer parameters:

    compress
    uncompress
    deflate
    inflate
    crc32
    adler32

    This should mean applications that make use of the module don't have to copy large buffers around.

  2. Normally the inflate method consumes all of the input buffer before returning. The exception to this is when inflate detects the end of the stream (Z_STREAM_END). In this case the input buffer need not be completely consumed. To allow processing of file formats that embed a deflation stream (e.g. zip, gzip), the inflate method now sets the buffer parameter to be what remains after inflation.

    When the return status is Z_STREAM_END, it will be what remains of the buffer (if any) after the deflated data stream. When the status is Z_OK it will be an empty string.

    This change means that the buffer parameter must be a lvalue.

  3. Fixed crc32 and adler32. They were both very broken.

  4. Added the Compress::Zlib::memGzip function.

1.01 23 Nov 1997

  1. A number of fixes to the test suite and the example scripts to allow them to work under win32. All courtesy of Gurusamy Sarathy.

1.02 10 Jan 1998

  1. The return codes for gzread, gzreadline and gzwrite were documented incorrectly as returning a status code.

  2. The test harness was missing a "gzclose". This caused a problem that showed up on an Amiga. Thanks to Erik van Roode for reporting this one.

  3. Patched zlib.t for OS/2. Thanks to Ilya Zakharevich for the patch.

1.03 17 Mar 1999

  1. Updated to use the new PL_ symbols. Means the module can be built with Perl 5.005_5*

1.04 27 May 1999

  1. Bug 19990527.001: compress(undef) core dumps -- Fixed.
