NAME

LCC - Content Provider Modules for the Local Content Cache system

SYNOPSIS

use LCC;                         # basic support

use LCC::Backend::textfile;      # if using the textfile Backend
use LCC::Backend::Storable;      # if using the Storable Backend
use LCC::Backend::DBI::mysql;    # if using the DBI Backend with mysql
use LCC::Backend::DBI;           # if using the DBI Backend with other driver

use LCC::Documents::filesystem;  # if using info of documents in filesystem
use LCC::Documents::DBI;         # if using info of documents in DBI database
use LCC::Documents::module;      # if using info of documents through module
use LCC::Documents::queue;       # if using info of documents through queue

# Create basic access to module

my $lcc = LCC->new( {RaiseError => 1} );

# Create status persistency backend

$lcc->Backend;                      # default file, Storable || textfile
$lcc->Backend( '/root/LCC.gz' );    # specific file, Storable || textfile
$lcc->Backend( 'textfile','/root/LCC.gz' ); # force flat textfiles
$lcc->Backend( 'Storable','/root/LCC.gz' ); # force Storable

my $dbh = DBI->connect($data_source, $username, $auth, \%attr);
$lcc->Backend( $dbh );                      # DBI with default table
$lcc->Backend( 'DBI',$dbh );                # force DBI, default table
$lcc->Backend( [$dbh,'table'] );            # DBI with specific table
$lcc->Backend( 'DBI',[$dbh,'table'] );      # force DBI, specific table

# Specify type of update

$lcc->complete;      # force complete update, regardless of UNS
$lcc->partial;       # partial update, UNS may force complete
$lcc->partial( 1 );  # force partial update, regardless of UNS

# Create document set specification

$lcc->Documents;     # assume filesystem, use current directory
$lcc->Documents( '/usr/local/apache/htdocs' );  # assume filesystem
$lcc->Documents( 'filesystem','/htdocs' );      # force 'filesystem'

my $dbh = DBI->connect( $data_source, $username, $auth, \%attr );
my $sth = $dbh->prepare( "SELECT id,mtime FROM table" );
$lcc->Documents( $sth );                        # assume DBI
$lcc->Documents( 'DBI',$sth );                  # force 'DBI'

my $object = Module->new;
$lcc->Documents( $object );          # assume module
$lcc->Documents( 'module',$object ); # force 'module'
$lcc->Documents( $object,{method => 'next_document'} ); # set alternate method

my $queue = threads::shared::queue->new;
$lcc->Documents( $queue );                      # assume queue
$lcc->Documents( 'queue',$queue );              # force 'queue'

# Set conversion methods

my $documents = $lcc->Documents;
$documents->fetch_url( sub {"http://server.com/$_[0].html"} );
$documents->browse_url( sub {"http://server.com/f.html?$_[0].html"} );
$documents->conceptual_url( sub {$_[0]} );

# Check for changed documents in this set
#  Create Update Notification XML
#  Update the backend

if ($lcc->check) {
  print $lcc->update_notification_xml({id => 'name', password => 'password'});
  $lcc->update;
}

DESCRIPTION

Provides the Content Provider Modules for the Local Content Cache system as found on http://lococa.sourceforge.net .

See the LCC::Overview documentation for an introduction to the Local Content Caching system and an overview of how these Perl modules interact. That documentation is not required reading before looking at any of the other documentation, but reading it first is highly recommended.

BASIC DISTRIBUTION

The following modules are part of the distribution:

LCC				base module
LCC::Backend			base class for storing local status
LCC::Backend::DBI		Backend using DBI for permanent storage
LCC::Backend::DBI::mysql	Backend using mysql for permanent storage
LCC::Backend::Storable		Backend using Storable for permanent storage
LCC::Backend::textfile		Backend using a textfile for permanent storage
LCC::Documents			base class for checking document information
LCC::Documents::DBI		Document information stored in a database
LCC::Documents::filesystem	Documents stored on a file system
LCC::Documents::module		Document information accessible by a Perl module
LCC::Documents::queue		Documents accessible by a threads::shared::queue
LCC::Overview			Overview of interaction between modules
LCC::UNS			setup connection to Update Notification Server

The following scripts are part of the distribution:

SETUP METHODS

The following methods are available for setting up the LCC object itself.

new

$lcc = LCC->new( {method => value} );

The LCC object itself is created by calling the class method "new" on the LCC package. The "new" class method accepts one input parameter: a reference to a hash or list of method-value pairs as handled by the Set method.

INHERITED METHODS

The following methods are inherited from the LCC object by every sub-object at the moment it is created. Any value set with these methods on the LCC object is therefore automatically active, in the same manner, in each sub-object created afterwards.

PrintError

$PrintError = $lccobject->PrintError;
$lccobject->PrintError( true | false | 'cluck' );

Sometimes you want your program to let you know immediately when there is an error. You can do this by calling the "PrintError" method with a true value. Each time an error occurs, a warning will be printed to STDERR informing you of the error that occurred. Check out the RaiseError method for letting your program stop execution immediately when an error has occurred. Check out the Errors method when you want to examine errors completely under program control.

As a special debugging feature, it is also possible to specify the keyword 'cluck'. If specified, it attempts to load the standard Perl module "Carp". If that is successful, then it sets the $SIG{__WARN__} handler to call the "Carp::cluck" subroutine. This causes a stack-trace to be shown to the developer when a warning occurs, either from an internal error or because anything else executes a -warn- statement.
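The 'cluck' behaviour described above can be sketched with core Perl only. This is a minimal illustration, not the code LCC itself uses; the helper name enable_cluck is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LCC: try to load Carp and, if that
# works, route every warning through Carp::cluck so that it is printed
# with a stack trace.
sub enable_cluck {
    return 0 unless eval { require Carp; 1 };
    $SIG{__WARN__} = \&Carp::cluck;
    return 1;
}

# From here on, any warn() in the program is reported with a stack trace.
enable_cluck() or warn "Carp could not be loaded, plain warnings only\n";
```

Perl disables the __WARN__ handler while it is running, so the warn issued inside Carp::cluck does not call the handler again.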

RaiseError

$RaiseError = $lccobject->RaiseError;
$lccobject->RaiseError( true | false | 'confess' );

Sometimes you want to have a program stop as soon as something goes wrong. By calling the "RaiseError" method with a true value, you are telling the module to immediately stop the program with an error message as soon as anything goes wrong. Check out PrintError to have each error output a warning on STDERR instead. Check out Errors if you want to examine errors completely under your control.

As a special debugging feature, it is also possible to specify the keyword 'confess'. If specified, it attempts to load the standard Perl module "Carp". If that is successful, then it sets the $SIG{__DIE__} handler to call the "Carp::confess" subroutine. This causes a stack-trace to be shown to the developer when an error occurs, either from an internal error or because anything else executes a -die- statement.
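To see what the 'confess' setting buys you, the following sketch installs Carp::confess as the die handler around a nested call and captures the resulting message. It uses core Perl only; the subroutines inner and outer and the error text are made up for the illustration.

```perl
use strict;
use warnings;
use Carp ();

# Two made-up subroutines, so that there is a call stack to show
sub inner { die "backend unreachable\n" }
sub outer { inner() }

# Install the handler only for the duration of this block
my $trace = do {
    local $SIG{__DIE__} = \&Carp::confess;
    eval { outer(); 1 };
    $@;   # the original message followed by the stack trace
};

print $trace;
```

The handler is also called for a die inside an eval, which is why the trace ends up in $@ here instead of terminating the program.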

OBJECT CREATION METHODS

The modules of the LCC family do not contain "new" methods that can be called directly. Instead, if you want to create e.g. an LCC::Backend::xxx object, you call the (instance) method "Backend" on an instantiated LCC object, e.g. "$backend = $lcc->Backend;".

Here only the parameters for the object creation are documented. Any additional methods are documented in the documentation of the module itself.

Backend

$lcc->Backend;                   # use default textfile or Storable
$lcc->Backend( 'type',$source ); # specify type and source
$backend = $lcc->Backend;        # obtain local copy of object

The Backend method can be used to create the backend (way to store the status of the documents) or to obtain the LCC::Backend::xxx object that was previously created.

The first (optional) input parameter specifies the type of backend to be used. Currently the following settings are supported:

- textfile  store the status of the backend in a flat textfile
- Storable  store the status of the backend in a Storable file
- DBI       store the status of the backend in a DBI-supported database

If the first input parameter is omitted, then "Storable" will be assumed if the Storable module is already loaded. Else, "textfile" will be assumed.

The second input parameter is optional when the type is "textfile" or "Storable": a file in the form "LCC.(type)" in the current directory will then be assumed.

If specified, the function of the second input parameter depends on the type (implicitly) specified. In the case of:

- textfile  it is the absolute filename to store the status in
- Storable  it is the absolute filename to store the status in
- DBI       it is a DBI database handle

If the second input parameter is a filename, then the extension ".gz" causes the file to be written in gzipped format, which takes up less disk space and may be faster in some cases.

This method returns the LCC::Backend::xxx object that was created. Please note that the object is not a LCC::Backend object, but rather a

- textfile   LCC::Backend::textfile
- Storable   LCC::Backend::Storable
- DBI        LCC::Backend::DBI(::driver)?

object that inherits from LCC::Backend.
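The defaulting rules described above can be summarised in a few lines of Perl. This is only a sketch of the documented behaviour, not the actual LCC implementation: the helpers guess_backend_type and wants_gzip are hypothetical, and treating anything that can "prepare" as a DBI handle is an assumption made for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LCC: guess the backend type from the
# source parameter when no type was given explicitly.
sub guess_backend_type {
    my ($source) = @_;

    # [$dbh,'table'] form: inspect the first element of the list
    $source = $source->[0] if ref( $source ) eq 'ARRAY';

    # Assumption: an object that can prepare() is a DBI database handle
    return 'DBI' if ref( $source ) and eval { $source->can( 'prepare' ) };

    # A filename (or nothing): Storable only if it is already loaded
    return exists $INC{'Storable.pm'} ? 'Storable' : 'textfile';
}

# Hypothetical helper: true if the filename asks for gzipped storage
sub wants_gzip {
    my ($filename) = @_;
    return ( defined( $filename ) && $filename =~ m#\.gz$# ) ? 1 : 0;
}

print guess_backend_type( '/root/LCC.gz' ), "\n";
print wants_gzip( '/root/LCC.gz' ) ? "gzipped\n" : "plain\n";
```

Because the choice between Storable and textfile depends on what is already loaded, naming the type explicitly is the safer option when the status file has to stay readable across scripts.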

Documents

$lcc->Documents;                   # assume 'filesystem' and current directory
$lcc->Documents( 'type',$source ); # specify type and source
$documents = $lcc->Documents;      # obtain local copy of object

The Documents method can be used to create access to the information of a (new) set of documents or to obtain the most recently created LCC::Documents::xxx object.

The first (optional) input parameter specifies the type of information of a set of documents. Currently the following settings are supported:

- filesystem  documents are files in a filesystem, inspect filesystem for info
- DBI         document information accessible through a DBI-statement handle
- module      document information accessible by calling a method of an object
- queue       document information accessible by a threads::shared::queue

If the first input parameter is omitted, then "filesystem" will be assumed.

The second input parameter is optional when the type is "filesystem": in that case the current directory will be assumed.

If specified, the function of the second input parameter depends on the type (implicitly) specified. In the case of:

- filesystem  top directory from which to look for files using File::Find
- DBI         DBI statement handle
- module      an instantiated object, call method "next_document"
- queue       an instantiated threads::shared::queue object

All but the "filesystem" type expect information to be returned as a list containing fields for id, mtime, length, md5, mimetype and subtype. All but the id and mtime fields are optional. A queue is supposed to contain references to lists with these values, rather than the lists themselves.

This method returns the LCC::Documents::xxx object that was created. Please note that the object is not a LCC::Documents object, but rather a

- filesystem   LCC::Documents::filesystem
- DBI          LCC::Documents::DBI
- module       LCC::Documents::module
- queue        LCC::Documents::queue

object that inherits from LCC::Documents.
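For the "module" type, an object along the following lines could serve as the document source. This is a sketch under assumptions: the class name My::Docs, the sample documents and the convention that an empty list signals the end of the set are all made up for the illustration and are not confirmed by this documentation.

```perl
use strict;
use warnings;

package My::Docs;   # hypothetical document source

sub new {
    my ($class,@documents) = @_;
    return bless { documents => [@documents] },$class;
}

# Return the fields of the next document as a list: id and mtime are
# required, length, md5, mimetype and subtype are optional.
# Assumption: an empty list means that there are no more documents.
sub next_document {
    my $self = shift;
    my $document = shift @{$self->{documents}} or return;
    return @{$document};
}

package main;

my $source = My::Docs->new(
    [ 'index.html',1020000000 ],
    [ 'about.html',1020003600,2048 ],
);

# Consume the source the way a caller of "next_document" might
my @ids;
while (my ($id,$mtime,$length) = $source->next_document) {
    push @ids,$id;
}
print "@ids\n";
```

Such an object would then be handed over with "$lcc->Documents( 'module',$source );". For the "queue" type the same fields would be enqueued as references to lists instead.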

UNS

$uns = $lcc->UNS( server:port | server );

Create a "LCC::UNS" object.

OTHER METHODS

The following methods are specific for the LCC object.

check

if ($lcc->check) {
# there are files for which a UN should be sent
}

Check all the Documents that have been previously specified against the Backend and set up the information to create the update_notification_xml from.

Returns the document IDs that will be in the Update Notification. Can be used in a scalar context to indicate whether an Update Notification should be done.

complete

$lcc->complete;

Force a complete Update Notification to be created by update_notification_xml.

partial

$lcc->partial;        # allow UNS to override
$lcc->partial( 1 );   # force partial update always, regardless of UNS

Indicate that a partial Update Notification should be created by update_notification_xml.

The optional input parameter indicates whether a partial update should be forced even if the Update Notification Server has indicated that a complete update is requested.

update

$lcc->update;

Update the status in the Backend. This method should be called whenever the Update Notification has been successful, so that a subsequent partial update will only include documents that have been changed since this Update Notification.

update_notification_xml

$credentials = {id => 'name', password => 'password'};
$lcc->update_notification_xml( $credentials,*STDOUT ); # print
$lcc->update_notification_xml( $credentials,[$handle,*STDERR] ); # file + warn
$xml = $lcc->update_notification_xml( $credentials );  # returned in var

Create the XML for the Update Notification for the given credentials and either send this to one or more handles or return it.

The first input parameter is a reference to a hash in which the credentials (the id and password that will give you access to the Update Notification Server) are stored.

The (optional) second input parameter is either a handle or a reference to a list of handles to which the XML that is created, will be sent. The XML will only be returned if no handles are specified.
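The "handle, list of handles, or return value" convention described above can be sketched as follows. The helper name emit and the XML string are hypothetical; this only illustrates the calling convention and is not the code used by update_notification_xml.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LCC: send text to one handle or to a
# list of handles, or return it when no handle was specified.
sub emit {
    my ($text,$target) = @_;
    return $text unless defined $target;

    my @handles = ref( $target ) eq 'ARRAY' ? @{$target} : ($target);
    print {$_} $text foreach @handles;
    return;
}

# In-memory handles stand in for real files in this illustration
open( my $first,'>',\my $buffer1 ) or die "Cannot open buffer: $!";
open( my $second,'>',\my $buffer2 ) or die "Cannot open buffer: $!";

emit( "<update/>\n",$first );               # single handle
emit( "<update/>\n",[$first,$second] );     # list of handles
my $xml = emit( "<update/>\n" );            # no handle: returned

close( $first );
close( $second );
```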

CONVENIENCE METHODS

The following methods are inheritable from the LCC module. They are intended to make life easier for the developer, and are specifically intended to be used within user scripts such as templates.

Get

($encoding,$xml) = $lccobject->Get( qw(encoding xml) );
$lccobject->Get( qw(encoding xml) ); # sets global vars $encoding and $xml

Sometimes you want to obtain the values returned by many methods from the same object. The "Get" method allows you to do just that: you specify the names of the methods to be executed on the object and the values from the method calls (without parameters) are either returned in the same order, or they are used to set global variables with the same name as the method.

If you are interested in calling multiple methods with parameters on the same object, and you are not interested in the return values, then you should call the Set method.
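The list-returning form of "Get" boils down to calling methods by name. The sketch below shows that idea with a made-up class My::Thing and a hypothetical helper get_values; it does not cover the variant that sets global variables.

```perl
use strict;
use warnings;

package My::Thing;   # hypothetical object with two accessor methods

sub new      { return bless {},shift }
sub encoding { return 'utf-8' }
sub xml      { return '<doc/>' }

package main;

# Hypothetical helper, not part of LCC: call each named method without
# parameters and return the values in the same order.
sub get_values {
    my ($object,@methods) = @_;
    return map { scalar $object->$_() } @methods;
}

my ($encoding,$xml) = get_values( My::Thing->new,qw(encoding xml) );
print "$encoding $xml\n";
```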

Set

$lccobject->Set( {
 methodname1	=> $value1,
 methodname2	=> $value2,
 methodname3	=> [$parameter1,$parameter2],
} );

It is often a hassle for the developer to call many methods to set parameters on the same object. To reduce this hassle, the "Set" method was developed. Instead of doing:

$lccobject->methodname1( $value1 );
$lccobject->methodname2( $value2 );
$lccobject->methodname3( $parameter1,$parameter2 );

you can do this in one go as specified above.

The "Set" method accepts either a reference to a hash (as specified by { }) or a reference to a list (as specified by [ ]). The hash reference form is preferable if the order in which the methods are executed is not important. If the order of execution is important, or if the same method must be called more than once, then you should use the list reference form, e.g.:

$lccobject->Set( [
 methodname1	=> $value1,
 methodname2	=> [],	                      # no parameters to be passed
 methodname2	=> [$parameter1,$parameter2], # more than 1 parameter
] );

Please note that if there is one parameter to the method, you can specify it directly. If there is more than one parameter to be passed to the method, then you must specify them as a reference to a list by putting them between square brackets, i.e. "[" and "]". If no parameters need to be passed to the method, you can specify this as a reference to an empty list, i.e. "[]".

The "Set" method disregards any values that were returned by the methods. If you are interested in the values that are returned by multiple methods, you can use the Get method.

Please note that the "Set" method is used internally in almost all object creation methods to allow you to immediately specify the options to be activated for that object.
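The dispatch that "Set" performs can be pictured roughly as below. This is a sketch of the documented calling convention only, not the LCC source: the helper set_all and the recording class My::Settings are hypothetical.

```perl
use strict;
use warnings;

package My::Settings;   # hypothetical object that records its calls

sub new   { return bless { calls => [] },shift }
sub name  { my $self = shift; push @{$self->{calls}},"name(@_)";  return }
sub range { my $self = shift; push @{$self->{calls}},"range(@_)"; return }

package main;

# Hypothetical helper, not part of LCC: call each method with its value,
# expanding a list reference into separate parameters.
sub set_all {
    my ($object,$spec) = @_;

    # A hash reference has no guaranteed order, a list reference has
    my @pairs = ref( $spec ) eq 'HASH' ? %{$spec} : @{$spec};

    while (my ($method,$value) = splice( @pairs,0,2 )) {
        $object->$method( ref( $value ) eq 'ARRAY' ? @{$value} : $value );
    }
    return;
}

my $settings = My::Settings->new;
set_all( $settings,[
 name	=> 'first',
 range	=> [1,10],   # more than 1 parameter
 name	=> [],       # no parameters to be passed
] );
print join( "\n",@{$settings->{calls}} ),"\n";
```

One consequence of this convention is that a single parameter which is itself a list reference has to be wrapped in another pair of square brackets.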

DEBUGGING METHODS

Dump

@info = $lccobject->Dump;
$lccobject->Dump;          # Data::Dumper->Dump output on object as warning

The "Dump" method is a quick-and-dirty interface to the Data::Dumper standard Perl module. When it is invoked, it will attempt to load the Data::Dumper module. If that is successful, it will create a dump of the object. If the method is called in a void context, the dump will be printed as a warning to STDERR. Else it will be returned by the "Dump" method.

No action will be performed if the Data::Dumper module can not be loaded.
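A context-sensitive dump of this kind can be written with "wantarray". The following is a minimal sketch with a hypothetical helper dump_object; the real "Dump" method may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LCC: dump a data structure, returning
# the text unless called in void context, in which case it is warned.
sub dump_object {
    my ($object) = @_;

    # No action if Data::Dumper cannot be loaded
    return unless eval { require Data::Dumper; 1 };

    my $dump = Data::Dumper->Dump( [$object],['object'] );
    return $dump if defined wantarray;   # scalar or list context

    warn $dump;                          # void context
    return;
}

my $text = dump_object( { id => 'index.html' } );
print $text;
```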

Errors

if ($lccobject->Errors) {        # does not remove errors in scalar context
  @error = $lccobject->Errors;   # returns errors, removes them from object
}

If an error occurs in the LCC family of modules, it is only reported "internally" as information added to the object. To find out whether there are any errors, you can call the "Errors" method in scalar context: it will then tell you how many errors there are. To find out exactly what the errors are, you can call the "Errors" method in list context: this also has the side-effect of removing the error information from the object, effectively resetting the error history of the object.

If you want your program to stop as soon as an error occurs, call the RaiseError method beforehand. If you want your program to also output a warning to STDERR each time an error occurs, call the PrintError method beforehand.
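The difference between scalar and list context described above can be reproduced in a few lines. The class My::Errors and its methods are hypothetical and only mimic the documented behaviour of "Errors".

```perl
use strict;
use warnings;

package My::Errors;   # hypothetical object that collects error messages

sub new       { return bless { errors => [] },shift }
sub add_error { my $self = shift; push @{$self->{errors}},@_; return }

# Scalar context: number of errors, nothing is removed.
# List context: the errors themselves, which are then removed.
sub errors {
    my $self = shift;
    return scalar @{$self->{errors}} unless wantarray;

    my @errors = @{$self->{errors}};
    $self->{errors} = [];
    return @errors;
}

package main;

my $object = My::Errors->new;
$object->add_error( 'cannot open status file','no documents found' );

if ($object->errors) {                # scalar context, errors are kept
    my @error = $object->errors;      # list context, errors are removed
    print "$_\n" foreach @error;
}
```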

EXAMPLES

using textfile and filesystem

# Load only the necessary modules

use LCC;
use LCC::Backend::textfile;     # only textfile module is needed
use LCC::Documents::filesystem; # only filesystem module is needed

# Create basic access to module, let errors cause a die

my $lcc = LCC->new( {RaiseError => 1} );

# Create the backend, storing the status in a specific file

$lcc->Backend( '/usr/local/apache/LCC.status' );

# Perform a partial update

$lcc->partial;

# Specify which documents to be checked and the server name to prefix

$lcc->Documents( '/usr/local/apache/htdocs' )->server( 'www.server.com' );

# If there are new files
#  Print the Update Notification XML to STDOUT
#  Update the status to the backend

if ($lcc->check) {
  print $lcc->update_notification_xml( {id => 'name',password => 'password'} );
  $lcc->update;
}

AUTHOR

Elizabeth Mattijsen, <liz@dijkmat.nl>.

maintained by LNATION, <thisusedtobeanemail@gmail.com>

Please report bugs to <perlbugs@dijkmat.nl>.

COPYRIGHT

Copyright (c) 2002 Elizabeth Mattijsen <liz@dijkmat.nl>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

http://lococa.sourceforge.net and the other LCC::xxx modules.