NAME

HTML::SummaryBasic - basic summary information from meta tags and the first paragraph.

SYNOPSIS

use HTML::SummaryBasic;
my $p = HTML::SummaryBasic->new({
	PATH => "D:/www/leegoddard_com/essays/aiCreativity.html",
	NOT_AVAILABLE => "There ain't none",
});
# What did we get?
foreach (keys %{$p->{SUMMARY}}){
	warn "$_ ... $p->{SUMMARY}->{$_}\n";
}

DEPENDENCIES

use HTML::TokeParser;
use HTML::HeadParser;

DESCRIPTION

Creates a hash of useful summary information from meta and body elements.

GLOBAL VARIABLES

$NOT_AVAILABLE

The value used when a piece of summary information cannot be found. May be overridden by supplying the constructor with a field of the same name. See "THE SUMMARY STRUCTURE".

CONSTRUCTOR (new)

Accepts a hash-like structure (a hash reference in the SYNOPSIS) with the following fields:

PATH

Path to file to process.

SUMMARY

Filled after get_summary is called (see "METHOD get_summary" and "THE SUMMARY STRUCTURE").

FIELDS

An array of meta tag names whose content value should be placed into the respective slots of the SUMMARY field after get_summary has been called.

THE SUMMARY STRUCTURE

A field of the object which is a hash, with key/values as follows:

AUTHOR, TITLE

Content of the HTML meta tags of the same names.

DESCRIPTION

Content of the meta tag of the same name.

LAST_MODIFIED_META, LAST_MODIFIED_FILE

Time of the last modification of the file, according to any meta tag of the same name and according to the file system, respectively. If the meta tag does not exist, LAST_MODIFIED_META takes the value of LAST_MODIFIED_FILE.

CREATED_META, CREATED_FILE

As above, but relating to the creation date of the file.

FIRST_PARA

The first HTML p element of the document.

HEADLINE

The first h1 tag; failing that, the first h2; failing that, the value of $NOT_AVAILABLE.

PLUS...

Any meta-fields specified in the FIELDS field.
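The HEADLINE fallback described above (first h1, then first h2, then $NOT_AVAILABLE) can be illustrated with HTML::TokeParser, one of the listed dependencies. This is a minimal standalone sketch, not the module's own code: the helper names first_text_of and headline and the default string are hypothetical, and treating an empty heading as missing is an assumption.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical default; the module's real default value is not documented here.
my $NOT_AVAILABLE = '[Not Available]';

# Return the trimmed text of the first <$tag> element, or undef if there is none.
sub first_text_of {
    my ($html, $tag) = @_;
    my $parser = HTML::TokeParser->new(\$html) or return;
    $parser->get_tag($tag) or return;                 # skip ahead to the opening tag
    my $text = $parser->get_trimmed_text("/$tag");    # text up to the closing tag
    return length $text ? $text : undef;              # assumption: empty counts as missing
}

# First h1; failing that, first h2; failing that, $NOT_AVAILABLE.
sub headline {
    my ($html) = @_;
    for my $tag (qw(h1 h2)) {
        my $text = first_text_of($html, $tag);
        return $text if defined $text;
    }
    return $NOT_AVAILABLE;
}
```

Each heading level gets a fresh parser, so an h2 that appears before the first h1 does not hide that h1.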

METHOD get_summary

Optionally takes an argument that overrides and resets the PATH field; otherwise uses the existing PATH field. Builds a summary of that file and stores it in the hash held in the SUMMARY field. See also "THE SUMMARY STRUCTURE".

Returns 1 on success, or undef on failure, in which case $! is set to an error message.
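A caller may want to check the return value of get_summary before reading the SUMMARY field. The following is a sketch only: the helper name summarise is hypothetical, passing FIELDS as an array reference is an assumption based on the description of that field, and 'keywords' is just an illustrative meta tag name.

```perl
use strict;
use warnings;
use HTML::SummaryBasic;

# Hypothetical helper: returns the SUMMARY hash reference, or undef on failure.
sub summarise {
    my ($path) = @_;
    my $p = HTML::SummaryBasic->new({
        PATH          => $path,
        NOT_AVAILABLE => 'n/a',
        FIELDS        => ['keywords'],   # assumption: given as an array reference
    });
    # Call get_summary explicitly and test its documented return value.
    $p->get_summary or return;
    return $p->{SUMMARY};
}

# Print every summary key and value for a file named on the command line.
if (@ARGV) {
    my $summary = summarise($ARGV[0])
        or die "No summary for $ARGV[0]: $!\n";
    for my $key (sort keys %$summary) {
        my $value = $summary->{$key};
        print "$key: ", (defined $value ? $value : ''), "\n";
    }
}
```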

METHOD load_file

Optionally takes an argument that overrides and resets the PATH field; otherwise uses the existing PATH field. Loads the HTML file at that path into a scalar.

Returns a reference to the scalar of HTML, or undef on failure, in which case $! is set to an error message.
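The return convention above (a scalar reference on success, undef with $! set on failure) can be illustrated with a stand-in written in core Perl. The sub slurp_html is hypothetical and is not the module's implementation; it only mirrors the documented contract so that the calling pattern is visible.

```perl
use strict;
use warnings;

# Hypothetical stand-in mirroring the documented contract of load_file.
sub slurp_html {
    my ($path) = @_;
    open my $fh, '<', $path or return;   # open sets $! on failure
    local $/;                            # slurp mode: read the whole file at once
    my $html = <$fh>;
    close $fh;
    $html = '' unless defined $html;
    return \$html;                       # reference to the scalar of HTML
}

# Calling pattern: test the result, then dereference it.
if (@ARGV) {
    my $html_ref = slurp_html($ARGV[0])
        or die "Cannot load $ARGV[0]: $!\n";
    print length(${$html_ref}), " characters of HTML\n";
}
```

A call to $p->load_file would presumably be handled the same way: test the result for undef, then dereference it with ${ ... }.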

TODO

Possibly support URIs as well as file paths.

SEE ALSO

HTML::TokeParser, HTML::HeadParser.

AUTHOR

Lee Goddard (LGoddard@CPAN.org)

COPYRIGHT

Copyright 2000-2001 Lee Goddard.

This library is free software; you may use, redistribute, or modify it under the same terms as Perl itself.
