NAME
DocSet::Doc - A Base Document Class
SYNOPSIS
# e.g. a subclass would do
use DocSet::Doc::HTML2HTML ();
my $doc = DocSet::Doc::HTML2HTML->new(%args);
$doc->scan();
my $meta = $doc->meta();
my $toc = $doc->toc();
$doc->render();
# internal methods
$doc->src_read();
$doc->src_filter();
DESCRIPTION
This superclass implements the core methods for scanning a single document of a given format and rendering it into another format. It provides subclasses with hooks that can change the default behavior. Note that this class cannot be used as is: you have to subclass it and implement the required methods listed under ABSTRACT METHODS.
METHODS
new
init
scan
Scans the document into a parse tree and retrieves its meta and toc data if possible.
render
Renders the output document and writes it to its final destination.
src_read
Fetches the source of the document. The source can be read from different media, e.g. file://, http://, a relational DB or OCR :) (but these are left for subclasses to implement :)
A subclass may implement a "source" filter. For example, if the source document is written in an extended POD dialect, the source filter may convert it into standard POD. If the source includes template directives, these can be pre-processed as well.
The document's content comes out of this stage ready for parsing and conversion into other formats.
meta
A simple get/set accessor for the meta attribute.
toc
A simple get/set accessor for the toc attribute.
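A combined get/set accessor of the kind described for meta and toc can be sketched as follows. This is only an illustration: the package name My::Doc and the hash-based storage are assumptions, not the actual DocSet::Doc implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in package; not the real DocSet::Doc source.
package My::Doc;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Combined getter/setter: with an argument it stores the value,
# and in both cases it returns the current value.
sub meta {
    my $self = shift;
    $self->{meta} = shift if @_;
    return $self->{meta};
}

sub toc {
    my $self = shift;
    $self->{toc} = shift if @_;
    return $self->{toc};
}

package main;

my $doc = My::Doc->new();
$doc->meta({ title => 'Example' });    # set
print $doc->meta->{title}, "\n";       # get
```

Calling the accessor without arguments leaves the stored value untouched, so the same method serves both reading and writing.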
transform_src_doc
my $doc_src_path = $self->transform_src_doc($path);
Searches for the source document with the path $path in the search paths defined by the search_paths attribute of the configuration file (similar to the @INC search in Perl). If found, the path is resolved relative to abs_doc_root and returned. If not found, undef is returned.
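The lookup described above might look roughly like the sketch below. The function name find_src_doc, its argument order, and the assumption that the search paths are absolute directories are all hypothetical; the real method reads search_paths and abs_doc_root from the configuration object.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper, not the real transform_src_doc: look for $path
# under each search directory in order, much as Perl walks @INC.
sub find_src_doc {
    my ($path, $abs_doc_root, @search_paths) = @_;
    for my $dir (@search_paths) {
        my $candidate = File::Spec->catfile($dir, $path);
        next unless -e $candidate;
        # Resolve the hit to a path relative to the document root.
        return File::Spec->abs2rel(File::Spec->rel2abs($candidate),
                                   $abs_doc_root);
    }
    return undef;    # documented behavior: undef when nothing matches
}

# Usage with made-up directories; the result depends on the local disk.
my $rel = find_src_doc('intro.pod', '/tmp/site',
                       '/tmp/site/src', '/tmp/site/docs');
print defined $rel ? "found: $rel\n" : "not found\n";
```

The first matching directory wins, so the order of the search paths determines which copy of a document is picked up.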
ABSTRACT METHODS
These methods must be implemented by the sub-classes:
- retrieve_meta_data
Retrieve the meta data that describes the input document and set it into the meta object attribute. Various documents may provide different meta information. The only required meta field is title.
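As a rough sketch of what a subclass's retrieve_meta_data could do, the example below takes the first non-blank line of a plain-text source as the title. Everything here is assumed for illustration: the packages My::Doc::Base and My::Doc::Text are stand-ins so the code runs without DocSet installed, the source is assumed to live in $self->{content} as a string, and the meta data is stored as a plain hash reference.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the base class so the sketch runs without
# DocSet installed; the real DocSet::Doc does much more.
package My::Doc::Base;

sub new { my ($class, %args) = @_; return bless { %args }, $class }

sub meta {
    my $self = shift;
    $self->{meta} = shift if @_;
    return $self->{meta};
}

# Hypothetical subclass for plain-text sources whose first non-blank
# line serves as the title.
package My::Doc::Text;
our @ISA = ('My::Doc::Base');

sub retrieve_meta_data {
    my ($self) = @_;
    # Assumption: the raw source is kept as a string in $self->{content}.
    my ($title) = $self->{content} =~ /^\s*(\S[^\n]*)/;
    $title = 'Untitled' unless defined $title;
    $title =~ s/\s+$//;                  # drop trailing whitespace
    $self->meta({ title => $title });    # title is the only required field
    return;
}

package main;

my $doc = My::Doc::Text->new(content => "My Guide\n\nSome body text.\n");
$doc->retrieve_meta_data();
print $doc->meta->{title}, "\n";
```

A real subclass would extract whatever its format offers (for example a title element in HTML or a NAME section in POD) and may add further fields beyond title.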
These methods can be implemented by the sub-classes:
- src_filter
A subclass may want to preprocess the source document before it is parsed. This method is called after the source has been read. By default it does nothing.
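The hook can be pictured with the sketch below, in which a subclass normalizes line endings before parsing. The packages My::Doc::Stub and My::Doc::Normalized are hypothetical, as are the src_read signature that takes the raw text as an argument and the storage of the source in $self->{content}; they only model the documented order of calls (read first, then filter).

```perl
use strict;
use warnings;

# Hypothetical stand-in base class: src_filter does nothing by default.
package My::Doc::Stub;

sub new { my ($class, %args) = @_; return bless { %args }, $class }
sub src_filter { return }

# Stores the source, then gives the filter hook a chance to rewrite it.
sub src_read {
    my ($self, $raw) = @_;
    $self->{content} = $raw;    # assumption: content is stored as a string
    $self->src_filter();
    return;
}

# Hypothetical subclass that normalizes line endings before parsing.
package My::Doc::Normalized;
our @ISA = ('My::Doc::Stub');

sub src_filter {
    my ($self) = @_;
    $self->{content} =~ s/\r\n?/\n/g;    # CRLF or bare CR to LF
    return;
}

package main;

my $doc = My::Doc::Normalized->new();
$doc->src_read("=head1 NAME\r\n\r\nExample\r\n");
print $doc->{content};
```

Because the base class supplies an empty src_filter, subclasses that need no preprocessing can simply omit the method.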
AUTHORS
Stas Bekman <stas (at) stason.org>