NAME
PAR::Intro - Introduction to Perl Archive Toolkit
SYNOPSIS
# This is a presentation, not a module.
Note that a more extensive tutorial is now available online at http://aut.dyndns.org/par-tutorial/ and has superseded the materials in this introduction.
DESCRIPTION
What is PAR (Perl Archive Toolkit)?
Does for Perl what JAR (Java Archive) does for Java
Platform-independent, compressed file format (zip)
Aggregates modules, scripts and other files into one file
Easy to generate, update and extract
Benefits of using PAR:
Reduced download and deployment time
Saves disk space by compression and selective packaging
Version consistency: solves forward-compatibility problems
Community support:
par@perl.org
You can also turn a PAR file into a self-contained script
Bundles all necessary 3rd-party modules with it
Requires only core Perl to run on the target machine
If you use pp to compile the script...
...you get an executable not even needing core perl
Getting Started
First, generate a PAR file with modules in it:
    % zip foo.par Hello.pm
    % zip -r foo.par lib/        # grab all modules in lib/
Using modules stored inside a PAR file:
    % perl -MPAR=./foo.par -MHello
    % perl -MPAR=./foo -MHello   # the .par part is optional
Or put it in @INC and use it just like a directory:
    % perl -MPAR -Ifoo.par -MHello
    % perl -MPAR -Ifoo -MHello   # ditto
Command-line Tools
Use pp to scan scripts and store dependencies as a PAR file:
    % pp -p source.pl            # makes 'source.par'
    % pp -B -p source.pl         # bundles core modules too
Use par.pl to run files from a Perl Archive:
    % par.pl foo.par             # looks for 'main.pl' by default
    % par.pl foo.par test.pl     # runs script/test.pl in foo.par
Use parl or parl.exe to run files from a Perl Archive:
    % parl foo.par
    % parl foo.par test.pl
Making Binary Executables
The pp utility can also generate binary executables:
    % pp -o packed.exe source.pl # self-contained .exe
    % packed.exe                 # runs anywhere with the same OS
You can also bundle additional modules:
    # packs CGI + its dependencies, too
    % pp -o packed.exe -M CGI source.pl
Or pack one-liners:
    # turns one-liner into executable
    % pp -o packed.exe -e 'print "Hi!"'
Some notes:
The command-line options of pp are almost identical to perlcc's
Modules are read directly from the PAR file, not extracted
Shared object files (aka dll) are extracted with File::Temp
Tested on Win32, FreeBSD, Linux, AIX, Solaris, Darwin and Cygwin.
The Anatomy of a PAR file
Modules can reside in different directories in a PAR file:
    /lib/                   # standard location
    /arch/                  # for creating from blib/
    /i386-freebsd/          # i.e. $Config{archname}
    /5.8.0/                 # i.e. Perl version number
    /5.8.0/i386-freebsd/    # combination of the two above
    /                       # casual packaging only
Scripts are stored in one of two locations:
    /script/                # standard location
    /                       # casual packaging only
Shared libraries may be architecture- or perl-version-specific:
/shlib/(5.8.0/)?(i386-freebsd/)?
PAR files may recursively contain other PAR files:
/par/(5.8.0/)?(i386-freebsd/)?
Special files:
    /MANIFEST               # index of the PAR's contents
    /SIGNATURE              # digital signature(s)
    /META.yml               # dependency, license info, etc.
    /Build.PL               # self-contained installer
Programs can use PAR::read_file($filename) to read file contents inside PAR
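The layout described in this section can be reproduced without the zip command. What follows is only a minimal sketch, assuming the core modules IO::Compress::Zip, IO::Uncompress::Unzip and File::Temp are available (they ship with Perl 5.10 and later). The member names and contents are made up for illustration, and the MANIFEST written here is just a plain list of names, not necessarily what pp would generate.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Compress::Zip qw($ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# Hypothetical members, placed in the standard PAR locations.
my %members = (
    'lib/Hello.pm'   => "package Hello;\nsub hi { return 'Hi!' }\n1;\n",
    'script/main.pl' => "use Hello;\nprint Hello::hi(), qq{\\n};\n",
);
my @names = sort keys %members;
$members{MANIFEST} = join("\n", @names, 'MANIFEST') . "\n";
push @names, 'MANIFEST';

# A throwaway file name ending in .par; removed when the program exits.
my ($tmp_fh, $par_path) = tempfile(SUFFIX => '.par', UNLINK => 1);
close $tmp_fh;

# Write one zip member per entry; newStream() starts each later member.
my $zip;
for my $name (@names) {
    if ($zip) {
        $zip->newStream(Name => $name);
    }
    else {
        $zip = IO::Compress::Zip->new($par_path, Name => $name)
            or die "cannot create $par_path: $ZipError\n";
    }
    $zip->print($members{$name});
}
$zip->close;

# Read the archive back, member by member.
my $unzip = IO::Uncompress::Unzip->new($par_path)
    or die "cannot read $par_path: $UnzipError\n";
my %found;
for (my $status = 1; $status > 0; $status = $unzip->nextStream) {
    my $name = $unzip->getHeaderInfo->{Name};
    my ($content, $buffer) = ('');
    while (($status = $unzip->read($buffer)) > 0) {
        $content .= $buffer;
    }
    last if $status < 0;
    $found{$name} = $content;
}
print "$_\n" for sort keys %found;
```

Because a PAR file is an ordinary zip file, any zip tool or library can stand in for the two modules used here.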
Derived Modules
Apache::PAR
Nathan Byrd's attempt to make self-contained Perl Handlers
Same as the WAR files for Java Servlets
Includes PerlRun and Registry handlers
App::Packer::Backend::PAR
Support module of Mattia Barbon's App::Packer suite
Makes it easy to pick-and-choose dependency scanners and packers
Fine-tuned distribution and packaging controls
CPANPLUS::Dist::PAR
Cross-platform PPM: Auto-generate PAR out of CPAN distributions
Use the bundled Build.PL to install PAR modules into the system
Apache::PAR Demo
In httpd.conf:
    <VirtualHost *>
    <IfDefine MODPERL2>
    PerlModule Apache::ServerUtil
    </IfDefine>
    PerlModule Apache::PAR
    PARDir /opt/myapp
    PARFile /opt/myapp/myapp.par
    </VirtualHost>
In web.conf inside myapp.par:
    Alias /myapp/static/ ##PARFILE##/
    <Location /myapp/static>
    SetHandler perl-script
    PerlHandler Apache::PAR::Static
    PerlAddVar PARStaticDirectoryIndex index.html
    PerlSetVar PARStaticDefaultMIME text/html
    </Location>

    Alias /myapp/cgi-perl/ ##PARFILE##/
    <Location /myapp/cgi-perl>
    Options +ExecCGI
    SetHandler perl-script
    PerlHandler Apache::PAR::Registry
    </Location>
Future Development
Polish pp's features
Handle corner dependency cases for LWP, Tk, DBI...
Optional encryption support (but *not* obscuring)
Become a worthy competitor to PerlApp and Perl2Exe
Learning from JAR
Making par.pl's command line interface in sync with jar's
Digital signatures for PAR packages using Module::Signature
File layout compatibility?
Learning from FreeBSD Bento
Smoke test and make PAR automatically for each CPAN upload
Provide binary packages for users without a compiler
Overview of PAR.pm's Implementation
Here begins the scary part
Grues, Dragons and Jabberwocks abound...
You are going to learn unpleasant things about Perl internals
Go home now if you have a heart condition or digestive problems
PAR invokes five areas of Perl arcana:
@INC code references
On-the-fly source filtering
Faking <DATA> filehandle with PerlIO::scalar and IO::Scalar
Overriding DynaLoader::bootstrap to handle XS modules
Making self-bootstrapping binary executables
The first two only work on 5.6 or later
PerlIO::scalar is 5.8-specific; IO::Scalar only needs 5.005
DynaLoader and %INC have been there since Perl 5 was born
PAR currently needs 5.6, but a 5.005 port is possible
Code References in @INC
On 1999-07-19, Ken Fox submitted a patch to P5P
To "enable using remote modules" by putting hooks in @INC
It was accepted into Perl 5.6, but was only documented in 5.8
Type 'perldoc -f require' to read the nitty-gritty details
Code references in @INC may return a filehandle, or undef to 'pass':
    push @INC, \&my_sub;
    sub my_sub {
        my ($coderef, $filename) = @_;  # $coderef is \&my_sub
        # -q -O - makes wget write the page to stdout, where the pipe reads it
        open my $fh, "wget -q -O - http://example.com/$filename |" or return;
        return $fh;  # using remote modules, indeed!
    }
Perl 5.8 lets you open a file handle to a string, so we just use that:
    open my $fh, '<', \($zip->memberNamed($filename)->contents);
    return $fh;
But Perl 5.6 does not have that, and I don't want to use temp files...
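For readers who want to try the Perl 5.8 approach without a zip file, here is a minimal sketch of an @INC hook that serves a module from memory through a string filehandle. The %archive hash, the Demo::Hello module and the @requested log are all hypothetical stand-ins for illustration; they are not part of PAR.

```perl
use strict;
use warnings;

# Hypothetical in-memory "archive": file name => module source.
my %archive = (
    'Demo/Hello.pm' => <<'END_SOURCE',
package Demo::Hello;
sub greet { my ($class, $who) = @_; return "Hello, $who!" }
1;
END_SOURCE
);

my @requested;   # records every file name the hook was asked for

sub archive_hook {
    my ($self, $filename) = @_;          # $self is \&archive_hook itself
    push @requested, $filename;
    # Returning nothing "passes" to the next @INC entry.
    return unless exists $archive{$filename};
    # Perl 5.8+: open a read handle directly on the string.
    open my $fh, '<', \$archive{$filename} or return;
    return $fh;
}

push @INC, \&archive_hook;

require Demo::Hello;                     # not on disk, so the hook supplies it
print Demo::Hello->greet('PAR'), "\n";
```

Since the hook is pushed to the end of @INC, modules that exist on disk are still found first; unshift would give the in-memory copies priority instead.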
Source Filtering without Filter::* Modules
... Undocumented features to the rescue!
It turns out that @INC hooks can return *two* values
The first is still the file handle
The second is a code reference for line-by-line source filtering!
This is how Acme::use::strict::with::pride works:
    # Force all modules used to use strict and warnings
    open my $fh, "<", $filename or return;
    my @lines = ("use strict; use warnings;\n", "#line 1 \"$full\"\n");
    return ($fh, sub {
        return 0 unless @lines;
        push @lines, $_; $_ = shift @lines; return length $_;
    });
But we don't really have a filehandle for anything
Another undocumented feature to the rescue
We can actually omit the first return value altogether:
    # Return all contents line-by-line from the file inside PAR
    my @lines = split /(?<=\n)/, $zip->memberNamed($filename)->contents;
    return (sub { $_ = shift(@lines); return length $_ });
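The filehandle-free form can be tried in isolation. The sketch below is an illustration under stated assumptions, not PAR's actual code: the %archive hash and the Demo::Lines module are invented, and the generator returns an explicit 1 or 0 (the convention later documented in 'perldoc -f require') rather than the length of the line.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the contents of a file inside a PAR.
my %archive = (
    'Demo/Lines.pm' => "package Demo::Lines;\nsub answer { return 42 }\n1;\n",
);

unshift @INC, sub {
    my ($self, $filename) = @_;
    return unless exists $archive{$filename};   # pass on unknown files

    # Split after each newline so every element is one complete line.
    my @lines = split /(?<=\n)/, $archive{$filename};

    # No filehandle at all: perl calls this sub once per line of source.
    return sub {
        return 0 unless @lines;   # 0 signals end of file
        $_ = shift @lines;        # the next line goes into $_
        return 1;                 # 1 signals "a line was produced"
    };
};

require Demo::Lines;
print Demo::Lines::answer(), "\n";
```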
Faking the <DATA> Handle
The @INC filter stops when it sees __END__ or __DATA__
All contents below are lost
Breaks modules that read from the <DATA> filehandle
The same problem appears when we eval the main.pl script
Therefore, we insert a line before the final token to fake *DATA
It has to be the final line to belong to the correct package
It has to happen at compile time but not inside a BEGIN block
Here is what I came up with (no longer needed in recent versions):
    $DATACache{$file} = $1 if ($program =~ s/^__DATA__\n?(.*)//ms);

    if (eval {require PerlIO::scalar; 1}) {
        "use PerlIO::scalar".
        " ( open(*DATA, '<:scalar', \\\$PAR::DATACache{'$key'}) ? () : () )";
    }
    elsif (eval {require IO::Scalar; 1}) {
        # This will first load IO::Scalar, *then* tie the handles.
        "use IO::Scalar".
        " ( tie(*DATA, 'IO::Scalar', \\\$PAR::DATACache{'$key'}) ? () : () )";
    }
    else {
        # only dies when it's used
        "use PAR (tie(*DATA, 'PAR::_data') ? () : ())\n";
    }

    sub PAR::_data::TIEHANDLE { return bless({}, shift) }
    sub PAR::_data::AUTOLOAD { die "Please install IO::Scalar first!\n" }
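The core of the trick, stripped of the compile-time injection, is small. This is a minimal runtime sketch, assuming Perl 5.8 or later for in-memory filehandles; the $program text is a made-up example, and the generated "use" line that PAR inserted is deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical script text, as it might be read out of a PAR file.
my $program = <<'END_PROGRAM';
print "running\n";
__DATA__
alpha
beta
END_PROGRAM

# Same substitution as in the snippet above: cut everything from the
# __DATA__ token onward and keep it for later.
my $data = '';
if ($program =~ s/^__DATA__\n?(.*)//ms) {
    $data = $1;
}

# Point the DATA handle at the saved text instead of a real file.
open(DATA, '<', \$data) or die "cannot open in-memory DATA: $!";
my @lines = <DATA>;
close DATA;

print "code left:  $program";
print "data lines: ", scalar(@lines), "\n";
```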
Overriding DynaLoader::bootstrap
XS modules have dynamically loaded libraries (.so or .dll)
They cannot be loaded as part of a zip file, so we extract them out
But I don't want to make any temporary auto/ directories
So we have to intercept DynaLoader's library-finding process
Module names are passed to bootstrap for XS loading
During the process, it calls dl_findfile to locate the file
So we wrap around both functions:
    no strict 'refs';
    no warnings 'redefine';
    $bootstrap   = \&DynaLoader::bootstrap;
    $dl_findfile = \&DynaLoader::dl_findfile;
    *{'DynaLoader::bootstrap'}   = \&_bootstrap;
    *{'DynaLoader::dl_findfile'} = \&_dl_findfile;
Our _bootstrap just checks if the library is in PARs
If yes, extract it to a File::Temp temp file
The file will be automatically cleaned up when the program ends
It then passes the arguments to the original $bootstrap
Finally, our _dl_findfile intercepts known filenames and returns them
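The wrapping technique itself does not depend on DynaLoader. The following sketch applies it to a hypothetical My::Loader package so that it can be run safely anywhere; My::Loader, its bootstrap function and the @intercepted log are invented for this illustration, and the comment marks where PAR would do its extraction.

```perl
use strict;
use warnings;

package My::Loader;    # hypothetical stand-in for DynaLoader

sub bootstrap {
    my ($module) = @_;
    return "bootstrapped $module";
}

package main;

my @intercepted;                               # modules seen by the wrapper
my $original = \&My::Loader::bootstrap;        # keep the real implementation

sub _bootstrap {
    my ($module) = @_;
    # PAR would look for the shared library in its archives here and
    # extract it to a temp file before handing over.
    push @intercepted, $module;
    return $original->(@_);                    # delegate to the original
}

{
    no strict 'refs';
    no warnings 'redefine';
    *{'My::Loader::bootstrap'} = \&_bootstrap; # install the wrapper
}

my $result = My::Loader::bootstrap('Foo::Bar');
print "$result (intercepted: @intercepted)\n";
```

Saving the original code reference before replacing the glob entry is what lets the wrapper add behaviour without reimplementing the wrapped function.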
Anatomy of a Self-Contained PAR executable
The par script ($0) itself
May be in plain-text (par.pl)
Or native executable format (par or par.exe)
Any number of embedded files
Typically used for bootstrapping PAR's various XS dependencies
Each section begins with the magic string "FILE"
Length of filename in pack('N') format and the filename (auto/.../)
File length in pack('N') and the file's content (not compressed)
One PAR file
This is just a zip file as usual
Beginning with the magic string "PK\003\004"
Ending section
A pack('N') number of the total length of FILE and PAR sections
Finally, there must be an 8-byte magic string: "\012PAR.pm\012"
Self-Bootstrapping Tricks
All we can expect is a working perl interpreter
The self-contained script *must not* use any modules at all
Not even strict.pm or DynaLoader.pm
But to process PAR files, we need XS modules like Compress::Zlib
A chicken-egg problem
Solution: bundle all module and object files needed by PAR.pm
That's what the FILE section in the previous slide is for
Load modules to memory, and write object files to disk
Then use a local @INC hook to load them on demand
We want to minimize the number of temporary files
First, try getting PerlIO::scalar loaded
So everything else can be in-memory
Next, try getting File::Temp loaded for better tempfile()
Set up an END hook to unlink all temp files up to this point
Load all other bundled files
Finally we are able to look in the compressed PAR section
This would be so much easier if we had a pure-perl inflate()
Patches welcome!
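The first bootstrapping steps can be sketched without loading a single module. The code below is an illustration only, not the real par.pl: the %bundled hash stands in for the parsed FILE sections, the Demo::Boot module and the fake object file are invented, and the temp directory is assumed to be $ENV{TMPDIR} or /tmp and writable.

```perl
# No 'use' lines on purpose: at this stage no module can be relied on,
# not even strict.pm.

# Hypothetical contents of the FILE sections, keyed by file name.
my %bundled = (
    'Demo/Boot.pm'      => "package Demo::Boot;\nsub ok { 'loaded' }\n1;\n",
    'auto/Demo/Boot.so' => 'fake object bytes',   # placeholder, not a library
);

my (%in_memory, @temp_files);
my $tmpdir = $ENV{TMPDIR} || '/tmp';   # assumption: this directory is writable

for my $name (sort keys %bundled) {
    if ($name =~ /\.pm$/) {            # modules stay in memory
        $in_memory{$name} = $bundled{$name};
        next;
    }
    (my $flat = $name) =~ s{/}{_}g;    # object files must go to disk
    my $path = "$tmpdir/par-demo-$$-$flat";
    open my $out, '>', $path or die "cannot write $path: $!";
    binmode $out;
    print {$out} $bundled{$name};
    close $out;
    push @temp_files, $path;
}

# Remove everything written so far, at the latest when the program ends.
sub cleanup_temp_files { unlink @temp_files; @temp_files = () }
END { cleanup_temp_files() }

{
    # A hook that is only in effect while the bundled modules are loaded.
    local @INC = (sub {
        my (undef, $filename) = @_;
        my $source = $in_memory{$filename};
        return unless defined $source;
        my @lines = split /(?<=\n)/, $source;
        return sub { return 0 unless @lines; $_ = shift @lines; return 1 };
    }, @INC);
    require Demo::Boot;
}

print Demo::Boot->ok, "\n";
```

The line-generator form of the hook is used here because it works even before PerlIO::scalar has been loaded.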
SEE ALSO
http://www.autrijus.org/par-tutorial/
http://www.autrijus.org/par-intro/ (English version)
http://www.autrijus.org/par-intro.zh/ (Chinese version)
ex::lib::zip, Acme::use::strict::with::pride
App::Packer, Apache::PAR, CPANPLUS, Module::Install
AUTHORS
Autrijus Tang <autrijus@autrijus.org>
http://par.perl.org/ is the official PAR website. You can write to the mailing list at <par@perl.org>, or send an empty mail to <par-subscribe@perl.org> to participate in the discussion.
Please submit bug reports to <bug-par@rt.cpan.org>.
COPYRIGHT
Copyright 2002, 2003 by Autrijus Tang <autrijus@autrijus.org>.
This document is free documentation; you can redistribute it and/or modify it under the same terms as Perl itself.