NAME
HTML::Toc - Generate, insert and update HTML Table of Contents.
DESCRIPTION
Generate, insert and update HTML Table of Contents (ToC).
Introduction
HTML::Toc consists of the following packages:
HTML::Toc
HTML::TocGenerator
HTML::TocInsertor
HTML::TocUpdator
HTML::Toc is the object which will eventually hold the Table of Contents. HTML::TocGenerator does the actual generation of the ToC. HTML::TocInsertor handles the insertion of the ToC in the source. HTML::TocUpdator takes care of updating previously inserted ToCs.
HTML::Parser is the base object of HTML::TocGenerator, HTML::TocInsertor and HTML::TocUpdator. Each of these objects uses its predecessor as its ancestor, as shown in the UML diagram underneath:
+---------------------+
|    HTML::Parser     |
+---------------------+
+---------------------+
| +parse()            |
| +parse_file()       |
+----------+----------+
          /_\
           |
+----------+----------+  <<uses>>  +-----------+
| HTML::TocGenerator  + - - - - - -+ HTML::Toc |
+---------------------+            +-----------+
+---------------------+            +-----------+
| +extend()           |            | +clear()  |
| +extendFromFile()   |            | +format() |
| +generate()         |            +-----+-----+
| +generateFromFile() |                  :
+----------+----------+                  :
          /_\                            :
           |                             :
+----------+----------+     <<uses>>     :
|  HTML::TocInsertor  + - - - - - - - - -+
+---------------------+                  :
+---------------------+                  :
| +insert()           |                  :
| +insertIntoFile()   |                  :
+----------+----------+                  :
          /_\                            :
           |                             :
+----------+----------+     <<uses>>     :
|  HTML::TocUpdator   + - - - - - - - - -+
+---------------------+
+---------------------+
| +insert()           |
| +insertIntoFile()   |
| +update()           |
| +updateFile()       |
+---------------------+
When generating a ToC you'll have to decide which object you want to use:
TocGenerator:
for generating a ToC without inserting the ToC into the source
TocInsertor:
for generating a ToC and inserting the ToC into the source
TocUpdator:
for generating and inserting a ToC, removing any previously
inserted ToC elements
Thus in tabular view, each object is capable of:
               generating  inserting  updating
               -------------------------------
TocGenerator       X
TocInsertor        X           X
TocUpdator         X           X         X
Generating
The code underneath will generate a ToC of the HTML headings <h1>..<h6> from a file index.htm:
use HTML::Toc;
use HTML::TocGenerator;
my $toc = HTML::Toc->new();
my $tocGenerator = HTML::TocGenerator->new();
$tocGenerator->generateFromFile($toc, 'index.htm');
print $toc->format();
For example, with index.htm containing:
<html>
<body>
<h1>Chapter</h1>
</body>
</html>
the output will be:
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="#h-1">Chapter</a></li>
</ul>
<!-- End of generated Table of Contents -->
Inserting
This code will generate a ToC of HTML headings <h1>..<h6> of file index.htm, and insert the ToC after the <body> tag at the same time:
use HTML::Toc;
use HTML::TocInsertor;
my $toc = HTML::Toc->new();
my $tocInsertor = HTML::TocInsertor->new();
$tocInsertor->insertIntoFile($toc, 'index.htm');
For example, with index.htm containing:
<html>
<body>
<h1>Chapter</h1>
</body>
</html>
the output will be:
<html>
<body>
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="#h-1">Chapter</a></li>
</ul>
<!-- End of generated Table of Contents -->
<h1><a name="h-1"></a>Chapter</h1>
</body>
</html>
Inserting into string
By default, HTML::TocInsertor::insert() prints both the string and the generated ToC to standard output. To actually insert the ToC in the string, use the output option to specify a scalar reference to insert the ToC into:
use HTML::Toc;
use HTML::TocInsertor;
my $toc = HTML::Toc->new();
my $tocInsertor = HTML::TocInsertor->new();
my $html = <<HTML;
<html>
<body>
<h1>Chapter</h1>
</body>
</html>
HTML
$tocInsertor->insert($toc, $html, {'output' => \$html});
print $html;
Now the output will be:
<html>
<body>
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="#h-1">Chapter</a></li>
</ul>
<!-- End of generated Table of Contents -->
<h1><a name="h-1"></a>Chapter</h1>
</body>
</html>
Inserting with update tokens
If you're planning to update the inserted ToC, you should use TocUpdator to insert the ToC. TocUpdator marks the inserted ToC elements with update tokens. These update tokens allow TocUpdator to identify and remove the ToC elements during a future update session. This code uses TocUpdator instead of TocInsertor:
use HTML::Toc;
use HTML::TocUpdator;
my $toc = HTML::Toc->new();
my $tocUpdator = HTML::TocUpdator->new();
$tocUpdator->insertIntoFile($toc, 'index.htm');
When applying the code above on 'index.htm':
<html>
<body>
<h1>
Chapter
</h1>
</body>
</html>
the output will contain additional update tokens:
<!-- #BeginToc -->
<!-- #EndToc -->
<!-- #BeginTocAnchorNameBegin -->
<!-- #EndTocAnchorNameBegin -->
<!-- #BeginTocAnchorNameEnd -->
<!-- #EndTocAnchorNameEnd -->
around the inserted ToC elements:
<html>
<body><!-- #BeginToc -->
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="#h-1"> Chapter </a></li>
</ul>
<!-- End of generated Table of Contents -->
<!-- #EndToc -->
<h1><!-- #BeginTocAnchorNameBegin --><a name="h-1"></a><!-- #EndTocAnchorNameBegin -->
Chapter
</h1>
</body>
</html>
Instead of HTML::TocUpdator::insertIntoFile() you can also use HTML::TocUpdator::updateFile(). HTML::TocUpdator::updateFile() will also insert the ToC, whether there is a ToC already inserted or not.
Updating
This code will generate a ToC of HTML headings <h1>..<h6> of file indexToc.htm, and insert or update the ToC after the <body> tag at the same time:
use HTML::Toc;
use HTML::TocUpdator;
my $toc = HTML::Toc->new();
my $tocUpdator = HTML::TocUpdator->new();
$tocUpdator->updateFile($toc, 'indexToc.htm');
For example, with indexToc.htm containing:
<html>
<body><!-- #BeginToc -->
foo
<!-- #EndToc -->
<!-- #BeginTocAnchorNameBegin -->bar<!-- #EndTocAnchorNameBegin --><h1>
Chapter
</h1><!-- #BeginTocAnchorNameEnd -->foo<!-- #EndTocAnchorNameEnd -->
</body>
</html>
the output will be:
<html>
<body><!-- #BeginToc -->
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="#h-1"> Chapter </a></li>
</ul>
<!-- End of generated Table of Contents -->
<!-- #EndToc -->
<h1><!-- #BeginTocAnchorNameBegin --><a name="h-1"></a><!-- #EndTocAnchorNameBegin -->
Chapter
</h1>
</body>
</html>
All text between the update tokens will be replaced. So be warned: all manual changes made to text between update tokens will be irrecoverably removed after calling HTML::TocUpdator::update() or HTML::TocUpdator::updateFile().
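To see why manual edits between update tokens are lost, the removal step can be sketched in plain Perl. This is only an illustration under assumptions, not HTML::TocUpdator's implementation: the helper name strip_update_spans is hypothetical, and the sketch relies on the default token spelling shown above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Toc): drop every span delimited
# by a matching pair of default update tokens, for example
# <!-- #BeginToc --> ... <!-- #EndToc -->.
sub strip_update_spans {
    my ($html) = @_;
    # \1 forces the End token to carry the same name as the Begin token;
    # /s lets '.' cross newlines, /g removes every such span.
    $html =~ s/<!-- #Begin(\w+) -->.*?<!-- #End\1 -->//gs;
    return $html;
}

my $source = join '',
    "<body><!-- #BeginToc -->\nfoo\n<!-- #EndToc -->\n",
    "<!-- #BeginTocAnchorNameBegin -->bar<!-- #EndTocAnchorNameBegin -->",
    "<h1>Chapter</h1></body>\n";

# Both 'foo' and 'bar' disappear together with their tokens.
print strip_update_spans($source);
```

Whatever sits between a Begin token and its End token, generated or hand-written, is treated the same way, which is why such text should never be edited manually.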
Formatting
The ToC isn't generated all at once. There are two stages involved: generating and formatting. Generating the ToC actually means storing a preliminary ToC in HTML::Toc->{_toc}. This preliminary, tokenized ToC has to be turned into something useful by calling HTML::Toc->format(). For an example, see paragraph 'Generating'.
Advanced
The ToC generation can be modified in a variety of ways. The following paragraphs each explain a single modification. An example of most of the modifications can be found in the manualTest.t test file. Within this test, a manual containing:
preface
introduction
table of contents
table of figures
table of tables
parts
chapters
appendixes
bibliography
is formatted all at once.
Using attribute value as ToC text
Normally, the ToC will be made of text between specified ToC tokens. It's also possible to use the attribute value of a token as ToC text. This can be done by marking the attribute with the attributeToTocToken ('@' by default) within the tokenBegin token. For example, suppose you want to generate a ToC of the alt attributes of the following image tokens:
<body>
<img src=test1.gif alt="First picture">
<img src=test2.gif alt="Second picture">
</body>
This would be the code:
use HTML::Toc;
use HTML::TocInsertor;
my $toc = HTML::Toc->new();
my $tocInsertor = HTML::TocInsertor->new();
$toc->setOptions({
'tokenToToc' => [{
'groupId' => 'image',
'tokenBegin' => '<img alt=@>'
}],
});
$tocInsertor->insertIntoFile($toc, $filename);
and the output will be:
<body>
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="#image-1">First picture</a></li>
<li><a href="#image-2">Second picture</a></li>
</ul>
<!-- End of generated Table of Contents -->
<a name="image-1"></a><img src="test1.gif" alt="First picture">
<a name="image-2"></a><img src="test2.gif" alt="Second picture">
</body>
Generate single ToC of multiple files
Besides generating a ToC of a single file, it's also possible to generate a single ToC of multiple files. This can be done by specifying an array of files as the file argument, by extending an existing ToC, or both.
Specify an array of files
For example, suppose you want to generate a ToC of both doc1.htm:
<body>
<h1>Chapter of document 1</h1>
</body>
and doc2.htm:
<body>
<h1>Chapter of document 2</h1>
</body>
Here's the code to do so by specifying an array of files:
use HTML::Toc;
use HTML::TocGenerator;
my $toc = HTML::Toc->new();
my $tocGenerator = HTML::TocGenerator->new();
$toc->setOptions({'doLinkToFile' => 1});
$tocGenerator->generateFromFile($toc, ['doc1.htm', 'doc2.htm']);
print $toc->format();
And the output will be:
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="doc1.htm#h-1">Chapter of document 1</a></li>
<li><a href="doc2.htm#h-2">Chapter of document 2</a></li>
</ul>
<!-- End of generated Table of Contents -->
Extend an existing ToC
It's also possible to extend an existing ToC. For example, suppose we want to generate a ToC of file doc1.htm:
<body>
<h1>Chapter of document 1</h1>
</body>
and extend this ToC with text from doc2.htm:
<body>
<h1>Chapter of document 2</h1>
</body>
Here's the code to do so:
use HTML::Toc;
use HTML::TocGenerator;
my $toc = HTML::Toc->new();
my $tocGenerator = HTML::TocGenerator->new();
$toc->setOptions({'doLinkToFile' => 1});
$tocGenerator->generateFromFile($toc, 'doc1.htm');
$tocGenerator->extendFromFile($toc, 'doc2.htm');
print $toc->format();
And the output will be:
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="doc1.htm#h-1">Chapter of document 1</a></li>
<li><a href="doc2.htm#h-2">Chapter of document 2</a></li>
</ul>
<!-- End of generated Table of Contents -->
Generate multiple ToCs
It's possible to generate multiple ToCs at once by specifying an array of HTML::Toc objects as the ToC argument. For example, suppose you want to generate a default ToC of HTML headings h1..h6 as well as a ToC of the alt image attributes of the following text:
<body>
<h1>Header One</h1>
<img src="test1.gif" alt="First picture">
<h2>Paragraph One</h2>
<img src="test2.gif" alt="Second picture">
</body>
Here's how you would do so:
use HTML::Toc;
use HTML::TocInsertor;
my $toc1 = HTML::Toc->new();
my $toc2 = HTML::Toc->new();
my $tocInsertor = HTML::TocInsertor->new();
$toc2->setOptions({
'tokenToToc' => [{
'groupId' => 'image',
'tokenBegin' => '<img alt=@>'
}],
});
$tocInsertor->insertIntoFile([$toc1, $toc2], $filename);
And the output will be:
<body>
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="#h-1">Header One</a>
<ul>
<li><a href="#h-1.1">Paragraph One</a></li>
</ul>
</li>
</ul>
<!-- End of generated Table of Contents -->
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="#image-1">First picture</a></li>
<li><a href="#image-2">Second picture</a></li>
</ul>
<!-- End of generated Table of Contents -->
<h1><a name="h-1"></a>Header One</h1>
<a name="image-1"></a><img src="test1.gif" alt="First picture"/>
<h2><a name="h-1.1"></a>Paragraph One</h2>
<a name="image-2"></a><img src="test2.gif" alt="Second picture"/>
</body>
Generate multiple groups in one ToC
You may want to generate a ToC consisting of multiple ToC groups.
Specify an additional 'Appendix' group
Suppose you want to generate a ToC with one group for the normal headings, and one group for the appendix headings, using this source file:
<body>
<h1>Chapter</h1>
<h2>Paragraph</h2>
<h3>Subparagraph</h3>
<h1>Chapter</h1>
<h1 class="appendix">Appendix Chapter</h1>
<h2 class="appendix">Appendix Paragraph</h2>
</body>
With the code underneath:
use HTML::Toc;
use HTML::TocInsertor;
my $toc = HTML::Toc->new();
my $tocInsertor = HTML::TocInsertor->new();
$toc->setOptions({
'tokenToToc' => [{
'tokenBegin' => '<h1 class="-appendix">'
}, {
'tokenBegin' => '<h2 class="-appendix">',
'level' => 2
}, {
'groupId' => 'appendix',
'tokenBegin' => '<h1 class="appendix">',
}, {
'groupId' => 'appendix',
'tokenBegin' => '<h2 class="appendix">',
'level' => 2
}]
});
$tocInsertor->insertIntoFile($toc, $filename);
the output will be:
<body>
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="#h-1">Chapter</a>
<ul>
<li><a href="#h-1.1">Paragraph</a></li>
</ul>
</li>
<li><a href="#h-2">Chapter</a></li>
</ul>
<ul>
<li><a href="#appendix-1">Appendix Chapter</a>
<ul>
<li><a href="#appendix-1.1">Appendix Paragraph</a></li>
</ul>
</li>
</ul>
<!-- End of generated Table of Contents -->
<h1><a name="h-1"></a>Chapter</h1>
<h2><a name="h-1.1"></a>Paragraph</h2>
<h3>Subparagraph</h3>
<h1><a name="h-2"></a>Chapter</h1>
<h1 class="appendix"><a name="appendix-1"></a>Appendix Chapter</h1>
<h2 class="appendix"><a name="appendix-1.1"></a>Appendix Paragraph</h2>
</body>
Specify an additional 'Part' group
Suppose you want to generate a ToC of a document which is divided in multiple parts like this file underneath:
<body>
<h1 class="part">First Part</h1>
<h1>Chapter</h1>
<h2>Paragraph</h2>
<h1 class="part">Second Part</h1>
<h1>Chapter</h1>
<h2>Paragraph</h2>
</body>
With the code underneath:
use HTML::Toc;
use HTML::TocInsertor;
my $toc = HTML::Toc->new();
my $tocInsertor = HTML::TocInsertor->new();
$toc->setOptions({
'doNumberToken' => 1,
'tokenToToc' => [{
'tokenBegin' => '<h1 class="-part">'
}, {
'tokenBegin' => '<h2 class="-part">',
'level' => 2,
}, {
'groupId' => 'part',
'tokenBegin' => '<h1 class="part">',
'level' => 1,
'doNumberToken' => 1,
'numberingStyle' => 'upper-alpha'
}]
});
$tocInsertor->insertIntoFile($toc, $filename);
the output will be:
<body>
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="#part-A">First Part</a></li>
</ul>
<ul>
<li><a href="#h-1">Chapter</a>
<ul>
<li><a href="#h-1.1">Paragraph</a></li>
</ul>
</li>
</ul>
<ul>
<li><a href="#part-B">Second Part</a></li>
</ul>
<ul>
<li><a href="#h-2">Chapter</a>
<ul>
<li><a href="#h-2.1">Paragraph</a></li>
</ul>
</li>
</ul>
<!-- End of generated Table of Contents -->
<h1 class="part"><a name="part-A"></a>A First Part</h1>
<h1><a name="h-1"></a>Chapter</h1>
<h2><a name="h-1.1"></a>Paragraph</h2>
<h1 class="part"><a name="part-B"></a>B Second Part</h1>
<h1><a name="h-2"></a>Chapter</h1>
<h2><a name="h-2.1"></a>Paragraph</h2>
</body>
Number ToC entries
By default, the generated ToC will list its entries unnumbered. If you want to number the ToC entries, two options are available. Either you can specify a numbered list by modifying templateLevelBegin and templateLevelEnd. Or when the ToC isn't a simple numbered list, you can use the numbers generated by HTML::TocGenerator.
Specify numbered list
By modifying templateLevelBegin and templateLevelEnd you can specify a numbered ToC:
use HTML::Toc;
use HTML::TocGenerator;
my $toc = HTML::Toc->new();
my $tocGenerator = HTML::TocGenerator->new();
$toc->setOptions({
'templateLevelBegin' => '"<ol>\n"',
'templateLevelEnd' => '"</ol>\n"',
});
$tocGenerator->generateFromFile($toc, 'index.htm');
print $toc->format();
For instance with the original file containing:
<body>
<h1>Chapter</h1>
<h2>Paragraph</h2>
</body>
The formatted ToC now will contain ol instead of ul tags:
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ol>
<li><a href="#h-1">Chapter</a>
<ol>
<li><a href="#h-1.1">Paragraph</a></li>
</ol>
</li>
</ol>
<!-- End of generated Table of Contents -->
See also: Using CSS for ToC formatting.
Use generated numbers
Instead of using the HTML ordered list (OL), it's also possible to use the generated numbers to number the ToC nodes. This can be done by modifying templateLevel:
use HTML::Toc;
use HTML::TocGenerator;
my $toc = HTML::Toc->new();
my $tocGenerator = HTML::TocGenerator->new();
$toc->setOptions({
'templateLevel' => '"<li>$node $text"',
});
$tocGenerator->generateFromFile($toc, 'index.htm');
print $toc->format();
For instance with the original file containing:
<body>
<h1>Chapter</h1>
<h2>Paragraph</h2>
</body>
The formatted ToC now will have the node numbers hard-coded:
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li>1 <a href=#h-1>Chapter</a>
<ul>
<li>1.1 <a href=#h-1.1>Paragraph</a></li>
</ul>
</li>
</ul>
<!-- End of generated Table of Contents -->
See also: Using CSS for ToC formatting.
Using CSS for ToC formatting
Suppose you want to display a ToC with upper-alpha numbered appendix headings. To accomplish this, you can specify a CSS style within the source document:
<html>
<head>
<style type="text/css">
ol.toc_appendix1 { list-style-type: upper-alpha }
</style>
</head>
<body>
<h1>Appendix</h1>
<h2>Appendix Paragraph</h2>
<h1>Appendix</h1>
<h2>Appendix Paragraph</h2>
</body>
</html>
Here's the code:
use HTML::Toc;
use HTML::TocInsertor;
my $toc = HTML::Toc->new();
my $tocInsertor = HTML::TocInsertor->new();
$toc->setOptions({
'templateLevelBegin' => '"<ol class=toc_$groupId$level>\n"',
'templateLevelEnd' => '"</ol>\n"',
'doNumberToken' => 1,
'tokenToToc' => [{
'groupId' => 'appendix',
'tokenBegin' => '<h1>',
'numberingStyle' => 'upper-alpha'
}, {
'groupId' => 'appendix',
'tokenBegin' => '<h2>',
'level' => 2,
}]
});
$tocInsertor->insertIntoFile($toc, $filename);
which will result in the following output:
<html>
<head>
<style type="text/css">
ol.toc_appendix1 { list-style-type: upper-alpha }
</style>
</head>
<body>
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ol class="toc_appendix1">
<li><a href="#appendix-A">Appendix</a>
<ol class="toc_appendix2">
<li><a href="#appendix-A.1">Appendix Paragraph</a></li>
</ol>
</li>
<li><a href="#appendix-B">Appendix</a>
<ol class="toc_appendix2">
<li><a href="#appendix-B.1">Appendix Paragraph</a></li>
</ol>
</li>
</ol>
<!-- End of generated Table of Contents -->
<h1><a name="appendix-A"></a>A Appendix</h1>
<h2><a name="appendix-A.1"></a>A.1 Appendix Paragraph</h2>
<h1><a name="appendix-B"></a>B Appendix</h1>
<h2><a name="appendix-B.1"></a>B.1 Appendix Paragraph</h2>
</body>
</html>
Creating site map
Suppose you want to generate a table of contents of the <title> tags of the files in the following directory structure:
path               file

.                  index.htm, <title>Main</title>
|- SubDir1         index.htm, <title>Sub1</title>
|  |- SubSubDir1   index.htm, <title>SubSub1</title>
|
|- SubDir2         index.htm, <title>Sub2</title>
|  |- SubSubDir1   index.htm, <title>SubSub1</title>
|  |- SubSubDir2   index.htm, <title>SubSub2</title>
|
|- SubDir3         index.htm, <title>Sub3</title>
By specifying 'fileSpec', which determines how many slashes (/) a file path may contain for a specific level:
use HTML::Toc;
use HTML::TocGenerator;
use File::Find;
my $toc = HTML::Toc->new;
my $tocGenerator = HTML::TocGenerator->new;
my @fileList;
sub wanted {
# Add file to 'fileList' if extension matches '.htm'
push (@fileList, $File::Find::name) if (m/\.htm$/);
}
$toc->setOptions({
'doLinkToFile' => 1,
'templateAnchorName' => '""',
'templateAnchorHref' => '"<a href=$file"."#".$groupId.$level.">"',
'doLinkTocToToken' => 1,
'tokenToToc' => [{
'groupId' => 'dir',
'level' => 1,
'tokenBegin' => '<title>',
'tokenEnd' => '</title>',
'fileSpec' => '\./[^/]+$'
}, {
'groupId' => 'dir',
'level' => 2,
'tokenBegin' => '<title>',
'tokenEnd' => '</title>',
'fileSpec' => '\./[^/]+?/[^/]+$'
}, {
'groupId' => 'dir',
'level' => 3,
'tokenBegin' => '<title>',
'tokenEnd' => '</title>',
'fileSpec' => '\./[^/]+?/[^/]+?/[^/]+$'
}]
});
# Traverse directory structure
find({wanted => \&wanted, no_chdir => 1}, '.');
# Generate ToC of case-insensitively sorted file list
$tocGenerator->extendFromFile(
$toc, [sort {uc($a) cmp uc($b)} @fileList]
);
print $toc->format();
the following ToC will be generated:
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href="./index.htm#">Main</a>
<ul>
<li><a href="./SubDir1/index.htm#">Sub1</a>
<ul>
<li><a href="./SubDir1/SubSubDir1/index.htm#">SubSub1</a></li>
</ul>
</li>
<li><a href="./SubDir2/index.htm#">Sub2</a>
<ul>
<li><a href="./SubDir2/SubSubDir1/index.htm#">SubSub1</a></li>
<li><a href="./SubDir2/SubSubDir2/index.htm#">SubSub2</a></li>
</ul>
</li>
<li><a href="./SubDir3/index.htm#">Sub3</a></li>
</ul>
</li>
</ul>
<!-- End of generated Table of Contents -->
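How the three 'fileSpec' patterns sort paths by depth can be checked in isolation. The sketch below is plain Perl and does not call HTML::Toc. The helper name level_for is hypothetical, and it assumes each pattern is applied to the path as an ordinary, unanchored regular expression.

```perl
use strict;
use warnings;

# The three 'fileSpec' patterns from the options above, keyed by level.
my %file_spec = (
    1 => '\./[^/]+$',
    2 => '\./[^/]+?/[^/]+$',
    3 => '\./[^/]+?/[^/]+?/[^/]+$',
);

# Hypothetical helper (not part of HTML::Toc): return the first level
# whose pattern matches the path, or nothing when no pattern matches.
sub level_for {
    my ($path) = @_;
    for my $level (sort { $a <=> $b } keys %file_spec) {
        return $level if $path =~ /$file_spec{$level}/;
    }
    return;
}

# Print the level each sample path falls into.
for my $path ('./index.htm', './SubDir1/index.htm',
              './SubDir1/SubSubDir1/index.htm') {
    printf "%-32s level %s\n", $path, level_for($path) // 'none';
}
```

Because every pattern ends in `[^/]+$`, a path with more slashes than the pattern allows cannot match, so each file lands in exactly one level.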
Methods
HTML::Toc::clear()
syntax: $toc->clear()
returns: --
Clear the ToC.
HTML::Toc::format()
syntax: $scalar = $toc->format()
returns: Formatted ToC.
Format tokenized ToC.
HTML::TocGenerator::extend()
syntax: $tocGenerator->extend($toc, $string [, $options])
args: - $toc: (reference to array of) HTML::Toc object(s) to extend
- $string: string to retrieve ToC from
- $options: hash reference containing generator options.
Extend ToC from specified string. For available options, see Parser Options.
HTML::TocGenerator::extendFromFile()
syntax: $tocGenerator->extendFromFile($toc, $filename [, $options])
args: - $toc: (reference to array of) HTML::Toc object(s) to extend
- $filename: (reference to array of) file(s) to extend ToC from
- $options: hash reference containing generator options.
Extend ToC from specified file. For available options, see Parser Options. For an example, see "Extend an existing ToC".
HTML::TocGenerator::generate()
syntax: $tocGenerator->generate($toc, $string [, $options])
args: - $toc: (reference to array of) HTML::Toc object(s) to generate
- $string: string to retrieve ToC from
- $options: hash reference containing generator options.
Generate ToC from specified string. Before generating, the ToC will be cleared. For extending an existing ToC, use the HTML::TocGenerator::extend() method. For available options, see Parser Options.
HTML::TocGenerator::generateFromFile()
syntax: $tocGenerator->generateFromFile($toc, $filename [, $options])
args: - $toc: (reference to array of) HTML::Toc object(s) to
generate
- $filename: (reference to array of) file(s) to generate ToC from
- $options: hash reference containing generator options.
Generate ToC from specified file. Before generating, the ToC will be cleared. For extending an existing ToC, use the HTML::TocGenerator::extendFromFile() method. For available options, see Parser Options.
HTML::TocInsertor::insert()
syntax: $tocInsertor->insert($toc, $string [, $options])
args: - $toc: (reference to array of) HTML::Toc object(s) to insert
- $string: string to insert ToC in
- $options: hash reference containing insertor options.
Generate ToC from specified string. The string and the generated ToC are printed to standard output. For available options, see Parser Options.
NOTE: To actually insert the ToC in the string, use the output option to specify a scalar reference to insert the ToC into. See Inserting into string for an example.
HTML::TocInsertor::insertIntoFile()
syntax: $tocInsertor->insertIntoFile($toc, $filename [, $options])
args: - $toc: (reference to array of) HTML::Toc object(s) to insert
- $filename: (reference to array of) file(s) to insert ToC in
- $options: hash reference containing insertor options.
Insert ToC into specified file. For available options, see Parser Options.
HTML::TocUpdator::insert()
syntax: $tocUpdator->insert($toc, $string [, $options])
args: - $toc: (reference to array of) HTML::Toc object(s) to insert
- $string: string to insert ToC in
- $options: hash reference containing updator options.
Insert ToC into specified string. Differs from HTML::TocInsertor::insert() in that inserted text will be surrounded with update tokens in order for HTML::TocUpdator to be able to update this text the next time an update is issued. See also: Update options.
HTML::TocUpdator::insertIntoFile()
syntax: $tocUpdator->insertIntoFile($toc, $filename [, $options])
args: - $toc: (reference to array of) HTML::Toc object(s) to insert
- $filename: (reference to array of) file(s) to insert ToC in
- $options: hash reference containing updator options.
Insert ToC into specified file. Differs from HTML::TocInsertor::insertIntoFile() in that inserted text will be surrounded with update tokens in order for HTML::TocUpdator to be able to update this text the next time an update is issued. For updator options, see Update options.
HTML::TocUpdator::update()
syntax: $tocUpdator->update($toc, $string [, $options])
args: - $toc: (reference to array of) HTML::Toc object(s) to update
- $string: string to update ToC in
- $options: hash reference containing updator options.
Update ToC within specified string. For updator options, see Update options.
HTML::TocUpdator::updateFile()
syntax: $tocUpdator->updateFile($toc, $filename [, $options])
args: - $toc: (reference to array of) HTML::Toc object(s) to update
- $filename: (reference to array of) file(s) to update ToC in
- $options: hash reference containing updator options.
Update ToC of specified file. For updator options, see Update options.
Parser Options
When generating a ToC, additional options may be specified which influence the way the ToCs are generated using either TocGenerator, TocInsertor or TocUpdator. The options must be specified as a hash reference. For example:
$tocGenerator->generateFromFile($toc, $filename, {doUseGroupsGlobal => 1});
Available options are:
doGenerateToc
syntax: [0|1]
default: 1
applicable to: TocInsertor, TocUpdator
True (1) if ToC must be generated. False (0) if ToC must be inserted only.
doUseGroupsGlobal
syntax: [0|1]
default: 0
applicable to: TocGenerator, TocInsertor, TocUpdator
True (1) if group levels must be used globally across ToCs. False (0) if not. This option only makes sense when an array of ToCs is specified. For example, suppose you want to generate two ToCs, one ToC for 'h1' tokens and one ToC for 'h2' tokens, of the file 'index.htm':
<h1>Chapter</h1>
<h2>Paragraph</h2>
Using the default setting of 'doUseGroupsGlobal' => 0:
use HTML::Toc;
use HTML::TocGenerator;
my $toc1 = HTML::Toc->new();
my $toc2 = HTML::Toc->new();
my $tocGenerator = HTML::TocGenerator->new();
$toc1->setOptions({
'header' => '',
'footer' => '',
'tokenToToc' => [{'tokenBegin' => '<h1>'}]
});
$toc2->setOptions({
'header' => '',
'footer' => '',
'tokenToToc' => [{'tokenBegin' => '<h2>'}]
});
$tocGenerator->generateFromFile([$toc1, $toc2], 'index.htm');
print $toc1->format() . "\n\n" . $toc2->format();
the output will be:
<ul>
<li><a href=#h-1>Chapter</a>
</ul>
<ul>
<li><a href=#h-1>Paragraph</a>
</ul>
Each ToC will use its own numbering scheme. Now if 'doUseGroupsGlobal' => 1 is specified:
$tocGenerator->generateFromFile(
[$toc1, $toc2], 'index.htm', {'doUseGroupsGlobal' => 1}
);
the output will be:
<ul>
<li><a href=#h-1>Chapter</a>
</ul>
<ul>
<li><a href=#h-2>Paragraph</a>
</ul>
using a global numbering scheme for all ToCs.
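The difference between the two settings comes down to where the counter lives. The following plain-Perl sketch models that idea only. It does not use HTML::Toc, the helper number_headings is hypothetical, and it assumes one heading per ToC as in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::Toc: number one heading per ToC,
# either with a counter per ToC (flag 0) or one counter shared by all (1).
sub number_headings {
    my ($use_groups_global, @headings) = @_;
    my $shared = 0;
    my @anchors;
    for my $heading (@headings) {
        my $own = 0;    # fresh counter, used when numbering is per ToC
        my $counter = $use_groups_global ? \$shared : \$own;
        ${$counter}++;
        push @anchors, "h-${$counter}";
    }
    return @anchors;
}

# Per-ToC counters, then a shared counter, for the same two headings.
print join(' ', number_headings(0, 'Chapter', 'Paragraph')), "\n";
print join(' ', number_headings(1, 'Chapter', 'Paragraph')), "\n";
```

With the flag off, both headings get the anchor suffix 1 because each ToC starts counting afresh; with the flag on, the second heading continues from the first.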
output
syntax: reference to scalar
default: none
applicable to: TocInsertor, TocUpdator
Reference to the scalar in which the output must be stored.
outputFile
syntax: scalar
default: none
applicable to: TocInsertor, TocUpdator
Filename to write output to. If no filename is specified, output will be written to standard output.
HTML::Toc Options
The HTML::Toc options can be grouped in the following categories:
Generate options
Token groups
Numbering tokens
Miscellaneous
Linking ToC to tokens
Insert options
Update options
Format options
The ToC options must be specified using the 'setOptions' method. For example:
my $toc = HTML::Toc->new();
$toc->setOptions({
'doNumberToken' => 1,
'footer' => '<!-- End Of ToC -->',
'tokenToToc' => [{
'level' => 1,
'tokenBegin' => '<h1>',
'numberingStyle' => 'lower-alpha'
}]
});
HTML::Toc Options Reference
attributeToExcludeToken
syntax: $scalar
default: '-'
Token which, when prefixed to an attribute value in a tokenBegin or insertionPoint token, marks that value as one the tag must not have in order to match. See also: Using attribute value as ToC text.
attributeToTocToken
syntax: $scalar
default: '@'
Token which marks an attribute in a tokenBegin token as an attribute which must be used as ToC text. See also: Using attribute value as ToC text.
doLinkToToken
syntax: [0|1]
default: 1
True (1) if ToC must be linked to tokens, False (0) if not. Note that 'HTML::TocInsertor' must be used to do the actual insertion of the anchor name within the source data.
doLinkToFile
syntax: [0|1]
default: 0
True (1) if ToC must be linked to file, False (0) if not. In effect only when doLinkToToken equals True (1) and templateAnchorHrefBegin isn't specified.
doLinkToId
syntax: [0|1]
default: 0
True (1) if ToC must be linked to tokens by using token ids. False (0) if ToC must be linked to tokens by using anchor names.
doNestGroup
syntax: [0|1]
default: 0
True (1) if groups must be nested in the formatted ToC, False (0) if not. In effect only when multiple groups are specified within the tokenToToc setting. For an example, see Generate multiple groups in one ToC.
doNumberToken
syntax: [0|1]
default: 0
True (1) if tokens which are used for the ToC generation must be numbered. This option may be specified both as a global ToC option or within a tokenToToc group. When specified within a tokenToToc option, the doNumberToken applies to that group only. For an example, see Specify an additional 'Part' group.
doSingleStepLevel
syntax: [0|1]
default: 1
True (1) if levels of a formatted ToC must advance one level at a time. For example, when generating a ToC of a file with a missing 'h2':
<h1>Header 1</h1>
<h3>Header 3</h3>
By default, an empty indentation level will be inserted in the ToC:
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href=#h-1>Header 1</a>
<ul>
<ul>
<li><a href=#h-1.0.1>Header 3</a>
</ul>
</ul>
</ul>
<!-- End of generated Table of Contents -->
After specifying:
$toc->setOptions({'doSingleStepLevel' => 0});
the ToC will not have an indentation level inserted for level 2:
<!-- Table of Contents generated by Perl - HTML::Toc -->
<ul>
<li><a href=#h-1>Header 1</a>
<ul>
<li><a href=#h-1.0.1>Header 3</a>
</ul>
</ul>
<!-- End of generated Table of Contents -->
fileSpec
syntax: <regexp>
default: undef
Specifies which files should match the current level. Valid only if doLinkToFile equals 1. For an example, see Creating site map.
footer
syntax: $scalar
default: "\n<!-- End of generated Table of Contents -->\n"
String to output at end of ToC.
groupId
syntax: $scalar
default: 'h'
Sets the group id attribute of a tokenGroup. With this attribute it's possible to divide the ToC into multiple groups. Each group has its own numbering scheme. For example, to generate a ToC of both normal headings and 'appendix' headings, specify the following ToC settings:
$toc->setOptions({
'tokenToToc' => [{
'tokenBegin' => '<h1 class=-appendix>'
}, {
'groupId' => 'appendix',
'tokenBegin' => '<h1 class=appendix>'
}]
});
groupToToc
syntax: <regexp>
default: '.*'
Determines which groups to use for generating the ToC. For example, to create a ToC for groups [a-b] only, specify:
'groupToToc' => '[a-b]'
This option is evaluated during both ToC generation and ToC formatting. This enables you to generate a ToC of all groups, but - after generating - format only specified groups:
$toc->setOptions({'groupToToc' => '.*'});
$tocGenerator->generate($toc, ...);
# Get ToC of all groups
$fullToc = $toc->format();
# Get ToC of 'appendix' group only
$toc->setOptions({'groupToToc' => 'appendix'});
$appendixToc = $toc->format();
header
syntax: $scalar
default: "\n<!-- Table of Contents generated by Perl - HTML::Toc -->\n"
String to output at begin of ToC.
insertionPoint
syntax: [<before|after|replace>] <token>
default: 'after <body>'
token: <[/]tag{ attribute=[-|@]<regexp>}> |
<text regexp> |
<declaration regexp> |
<comment regexp>
Determines the point within the source where the ToC should be inserted. When specifying a start tag as the insertion point token, attributes to be included may be specified as well. Note that the attribute value must be specified as a regular expression. For example, to specify the <h1 class=header> tag as insertion point:
'<h1 class=^header$>'
Examples of valid 'insertionPoint' tokens are:
'<h1>'
'</h1>'
'<!-- ToC -->'
'<!ToC>'
'ToC will be placed here'
It is also possible to specify attributes to exclude, by prefixing the value with an attributeToExcludeToken, by default a minus sign (-). For example, to specify the <h1> tag as insertion point, excluding all <h1 class=header> tags:
'<h1 class=-^header$>'
See also tokenBegin.
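The insertionPoint syntax combines an optional position keyword with a token. As a rough illustration only, the sketch below splits such a string in plain Perl. The helper name splitInsertionPoint is hypothetical and not part of HTML::Toc; it does not claim to reproduce how the module itself parses the option.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Toc): split an insertionPoint
# string into its optional position keyword and its token.
sub splitInsertionPoint {
    my ($spec) = @_;
    my ($where, $token) =
        $spec =~ /^\s*(?:(before|after|replace)\s+)?(.+?)\s*$/s;
    # $where is undef when no keyword was given.
    return ($where, $token);
}

for my $spec ('after <body>', 'replace <!-- ToC -->', '<h1 class=^header$>') {
    my ($where, $token) = splitInsertionPoint($spec);
    printf "%-8s %s\n", (defined($where) ? $where : '(none)'), $token;
}
```

Note that this simple split would also treat a text token that happens to start with "before ", "after " or "replace " as a keyword followed by a token.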
level
syntax: number
default: 1
Number which identifies at which level the tokengroup should be incorporated into the ToC. See also: tokenToToc.
levelIndent
syntax: number
default: 3
Sets the number of spaces each level will be indented, when formatting the ToC.
levelToToc
syntax: <regexp>
default: '.*'
Determines which group levels to use for generating the ToC. For example, to create a ToC for levels 1-2 only, specify:
'levelToToc' => '[1-2]'
This option is evaluated during both ToC generation and ToC formatting. This enables you to generate a ToC of all levels, but - after generating - retrieve only specified levels:
$toc->setOptions({'levelToToc' => '.*'});
$tocGenerator->generate($toc, ...);
# Get ToC of all levels
$fullToc = $toc->format();
# Get ToC of level 1 only
$toc->setOptions({'levelToToc' => '1'});
$level1Toc = $toc->format();
numberingStyle
syntax: [decimal|lower-alpha|upper-alpha|lower-roman|upper-roman]
default: decimal
Determines which numbering style to use for a token group when doNumberToken is set to True (1). When specified as a main ToC option, the setting will be the default for all groups. When specified within a tokengroup, this setting will override any default for that particular tokengroup, e.g.:
$toc->setOptions({
'doNumberToken' => 1,
'tokenToToc' => [{
'level' => 1,
'tokenBegin' => '<h1>',
'numberingStyle' => 'lower-alpha'
}]
});
If a roman style is specified, be sure to have the Roman module installed, available from http://www.perl.com/CPAN/modules/by-module/Roman.
templateAnchorName
syntax: <expression|function reference>
default: '$groupId."-".$node'
Anchor name to use when doLinkToToken is set to True (1). The anchor name is passed to both templateAnchorHrefBegin and templateAnchorNameBegin. The template may be specified as either an expression or a function reference. The expression may contain the following variables:
$file
$groupId
$level
$node
$text E.g. with "<h1><b>Intro</b></h1>", $text = "Intro"
$children Text, including HTML child elements.
E.g. with "<h1><b>Intro</b></h1>", $children = "<b>Intro</b>"
If templateAnchorName is a function reference to a function returning the anchor, as in:
$toc->setOptions({'templateAnchorName' => \&assembleAnchorName});
the function will be called with the following arguments:
$anchorName = assembleAnchorName($file, $groupId, $level, $node, $text, $children);
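As an illustration of the function-reference form, here is a minimal sketch of a callback that derives the anchor name from the heading text. The slug logic and the sample argument values (including the '1.2' form of $node) are assumptions made for this example; only the argument order is taken from the description above.

```perl
use strict;
use warnings;

# Hypothetical callback: builds an anchor name from the heading text.
# Arguments arrive in the order documented above.
sub assembleAnchorName {
    my ($file, $groupId, $level, $node, $text, $children) = @_;
    # Lowercase the text and collapse every run of non-alphanumerics to '-'.
    (my $slug = lc $text) =~ s/[^a-z0-9]+/-/g;
    $slug =~ s/^-+|-+$//g;    # trim leading and trailing dashes
    # Fall back to the documented default form when no usable text remains.
    return length($slug) ? "$groupId-$slug" : "$groupId-$node";
}

# With an HTML::Toc object in $toc, the callback would be registered as:
#   $toc->setOptions({'templateAnchorName' => \&assembleAnchorName});

# Sample call with made-up values; prints "h-getting-started".
print assembleAnchorName('index.htm', 'h', 1, '1.2',
    'Getting Started!', '<b>Getting Started!</b>'), "\n";
```

Two headings with the same text would receive the same anchor name with this sketch, so a real callback may want to append $node to keep names unique.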
templateAnchorHrefBegin
syntax: <expression|function reference>
default: '"<a href=#$anchorName>"' or
'"<a href=$file#$anchorName>"',
depending on 'doLinkToFile' being 0 or 1 respectively.
Anchor reference begin token to use when doLinkToToken is set to True (1). The template may be specified as either an expression or a function reference. The expression may contain the following variables:
$file
$groupId
$level
$node
$anchorName
If templateAnchorHrefBegin is a function reference to a function returning the anchor, as in:
$toc->setOptions({'templateAnchorHrefBegin' => \&assembleAnchorHrefBegin});
the function will be called with the following arguments:
$anchorHrefBegin = &assembleAnchorHrefBegin(
$file, $groupId, $level, $node, $anchorName
);
See also: templateAnchorName, templateAnchorHrefEnd.
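A callback for templateAnchorHrefBegin might look like the following sketch. It quotes the href value and adds a CSS class per level; the class name "toc-level-N" is invented for this example, and the handling of an empty $file is an assumption about what the callback may receive.

```perl
use strict;
use warnings;

# Hypothetical callback: returns the opening <a href=...> tag for a ToC entry.
sub assembleAnchorHrefBegin {
    my ($file, $groupId, $level, $node, $anchorName) = @_;
    # Link into another file only when a file name was actually passed in.
    my $target = (defined($file) && length($file))
        ? "$file#$anchorName"
        : "#$anchorName";
    return qq{<a class="toc-level-$level" href="$target">};
}

# With an HTML::Toc object in $toc, the callback would be registered as:
#   $toc->setOptions({'templateAnchorHrefBegin' => \&assembleAnchorHrefBegin});

print assembleAnchorHrefBegin('', 'h', 2, '1.1', 'h-1.1'), "\n";
```

The matching end tag is still supplied by templateAnchorHrefEnd, so the callback only returns the opening tag.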
templateAnchorHrefEnd
syntax: <expression|function reference>
default: '"</a>"'
Anchor reference end token to use when doLinkToToken is set to True (1). The template may be specified as either an expression or a function reference. If templateAnchorHrefEnd is a function reference to a function returning the anchor end, like in:
$toc->setOptions({'templateAnchorHrefEnd' => \&assembleAnchorHrefEnd});
the function will be called without arguments:
$anchorHrefEnd = &assembleAnchorHrefEnd;
See also: templateAnchorHrefBegin.
templateAnchorNameBegin
syntax: <expression|function reference>
default: '"<a name=$anchorName>"'
Anchor name begin token to use when doLinkToToken is set to True (1). The template may be specified as either an expression or a function reference. The expression may contain the following variables:
$file
$groupId
$level
$node
$anchorName
If templateAnchorNameBegin is a function reference to a function returning the anchor name, as in:
$toc->setOptions({'templateAnchorNameBegin' => \&assembleAnchorNameBegin});
the function will be called with the following arguments:
$anchorNameBegin = assembleAnchorNameBegin(
$file, $groupId, $level, $node, $anchorName
);
See also: templateAnchorName, templateAnchorNameEnd.
templateAnchorNameEnd
syntax: <expression|function reference>
default: '"</a>"'
Anchor name end token to use when doLinkToToken is set to True (1). The template may be specified as either an expression or a function reference. If templateAnchorNameEnd is a function reference to a function returning the anchor end, like in:
$toc->setOptions({'templateAnchorNameEnd' => \&assembleAnchorNameEnd});
the function will be called without arguments:
$anchorNameEnd = &assembleAnchorNameEnd;
See also: templateAnchorNameBegin.
templateLevel
syntax: <expression|function reference>
default: '"<li>$text"'
Expression to use when formatting a ToC node. The template may be specified as either an expression or a function reference. The expression may contain the following variables:
$level
$groupId
$node
$sequenceNr
$text
If templateLevel is a function reference to a function returning the ToC node, as in:
$toc->setOptions({'templateLevel' => \&AssembleTocNode});
the function will be called with the following arguments:
$tocNode = &AssembleTocNode(
$level, $groupId, $node, $sequenceNr, $text
);
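For illustration, a templateLevel callback could add a class attribute to each list item, as in the sketch below. The class naming scheme is invented for this example; the argument order follows the call shown above.

```perl
use strict;
use warnings;

# Hypothetical callback: returns the opening part of one ToC node.
# The <li> is left open here on purpose, assuming that templateLevelClose
# (default "</li>\n") emits the closing tag.
sub AssembleTocNode {
    my ($level, $groupId, $node, $sequenceNr, $text) = @_;
    return qq{<li class="toc-$groupId-$level">$text};
}

# With an HTML::Toc object in $toc, the callback would be registered as:
#   $toc->setOptions({'templateLevel' => \&AssembleTocNode});

print AssembleTocNode(1, 'h', '1', 1, 'Introduction'), "\n";
```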
templateLevelClose
syntax: <expression|function reference>
default: '"</li>\n"'
Expression to use when closing a ToC node. The template may be specified as either an expression or a function reference.
templateLevelBegin
syntax: <expression>
default: '"<ul>\n"'
Expression to use when formatting the beginning of a ToC level. See templateLevel for a list of available variables to use within the expression. For example, to give each ToC level a class name to use with Cascading Style Sheets (CSS), use the expression:
'"<ul class=\"toc_$groupId$level\">\n"'
which gives each ToC level a class name built from its group id and level, for example:
<ul class="toc_h1">
<li>Header
</ul>
For an example, see Using CSS for ToC formatting.
templateLevelEnd
syntax: <expression>
default: '"</ul>\n"'
Expression to use when formatting the end of a ToC level. See templateLevel for a list of available variables to use within the expression. The default expression is:
'"</ul>\n"'
For an example, see Using CSS for ToC formatting.
templateTokenNumber
syntax: <expression|function reference>
default: '"$node "'
Token number to use when doNumberToken equals True (1). The template may be specified as either an expression or a function reference. The expression has access to the following variables:
$file
$groupId
$groupLevel
$level
$node
$toc
If templateTokenNumber is a function reference to a function returning the token number, as in:
$toc->setOptions({'templateTokenNumber' => \&assembleTokenNumber});
the function will be called with the following arguments:
$number = &assembleTokenNumber(
$node, $groupId, $file, $groupLevel, $level, $toc
);
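As a sketch of the function-reference form, the callback below formats numbers differently for an 'appendix' group (as used in the groupId example) than for other groups. The output formats and sample values are assumptions chosen for illustration; the argument order follows the call shown above.

```perl
use strict;
use warnings;

# Hypothetical callback: returns the number text placed in front of a token.
sub assembleTokenNumber {
    my ($node, $groupId, $file, $groupLevel, $level, $toc) = @_;
    # Appendix headings get a word prefix; all other groups get "N. ".
    return $groupId eq 'appendix' ? "Appendix $node: " : "$node. ";
}

# With an HTML::Toc object in $toc, the callback would be registered as:
#   $toc->setOptions({
#       'doNumberToken'       => 1,
#       'templateTokenNumber' => \&assembleTokenNumber
#   });

print assembleTokenNumber('1.2', 'h', '', 2, 2, undef), "\n";
print assembleTokenNumber('A', 'appendix', '', 1, 1, undef), "\n";
```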
tokenBegin
syntax: <token>
default: '<h1>'
token: <[/]tag{ attribute=[-|@]<regexp>}> |
<text regexp> |
<declaration regexp> |
<comment regexp>
This scalar defines the token that will trigger text to be put into the ToC. Any start tag, end tag, comment, declaration or text string can be specified. Examples of valid 'tokenBegin' tokens are:
'<h1>'
'</end>'
'<!-- Start ToC entry -->'
'<!Start ToC entry>'
'ToC entry'
When specifying a start tag, attributes to be included may be specified as well. Note that the attribute value is used as a regular expression. For example, to specify the <h1 class=header> tag as tokenBegin:
'<h1 class=^header$>'
It is also possible to specify attributes to exclude, by prefixing the value with an attributeToExcludeToken, by default a minus sign (-). For example, to specify the <h1> tag as tokenBegin, excluding all <h1 class=header> tags:
'<h1 class=-^header$>'
You can also specify an attribute value to be used as ToC text, by prefixing the value with an attributeToTocToken, by default an at sign (@). For example, to use the class value as ToC text:
'<h1 class=@>'
See Generate multiple ToCs for an elaborate example using the attributeToTocToken to generate a ToC of image alt attribute values.
See also: tokenEnd, tokenToToc.
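Because attribute values are treated as regular expressions, the ^ and $ anchors in the examples above matter. The following plain-Perl sketch mimics the include and exclude rules to show the difference. The helper attributeMatches is hypothetical and not part of HTML::Toc, and it ignores the attributeToTocToken (@) form.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Toc): decide whether an attribute
# value satisfies a spec such as '^header$' (include) or '-^header$' (exclude).
sub attributeMatches {
    my ($spec, $value) = @_;
    my $exclude = ($spec =~ s/^-//);                  # strip and remember a leading minus
    my $hit = defined($value) && $value =~ /$spec/;   # the spec is used as a regexp
    return $exclude ? !$hit : $hit;
}

print "anchored:   ", (attributeMatches('^header$', 'subheader') ? 'match' : 'no match'), "\n";
print "unanchored: ", (attributeMatches('header', 'subheader') ? 'match' : 'no match'), "\n";
print "excluded:   ", (attributeMatches('-^header$', 'header') ? 'match' : 'no match'), "\n";
```

Without the anchors, the pattern 'header' also matches a class value such as 'subheader', which is usually not what is intended.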
tokenEnd
syntax: $scalar
default: empty string ('') or end tag counterpart of 'tokenBegin' if
'tokenBegin' is a start tag
The 'tokenEnd' definition follows the same rules as tokenBegin.
See also: tokenBegin, tokenToToc.
tokenToToc
syntax: [{array of hashrefs}]
default: [{
'level' => 1,
'tokenBegin' => '<h1>'
}, {
'level' => 2,
'tokenBegin' => '<h2>'
}, {
'level' => 3,
'tokenBegin' => '<h3>'
}, {
'level' => 4,
'tokenBegin' => '<h4>'
}, {
'level' => 5,
'tokenBegin' => '<h5>'
}, {
'level' => 6,
'tokenBegin' => '<h6>'
}]
This array of hash references defines the tokens that must act as ToC entries. Each hash describes one tokengroup and may contain a groupId, level, numberingStyle, tokenBegin and tokenEnd identifier.
tokenUpdateBeginAnchorName
syntax: <string>
default: '<!-- #BeginTocAnchorNameBegin -->';
This token marks the beginning of an anchor name inserted by HTML::TocInsertor. This option is used by HTML::TocUpdator.
tokenUpdateEndAnchorName
syntax: <string>
default: '<!-- #EndTocAnchorName -->';
This option is used by HTML::TocUpdator to mark the end of an inserted anchor name.
tokenUpdateBeginNumber
syntax: <string>
default: '<!-- #BeginTocNumber -->';
This option is used by HTML::TocUpdator to mark the beginning of an inserted number.
tokenUpdateEndNumber
syntax: <string>
default: '<!-- #EndTocNumber -->';
This option is used by HTML::TocUpdator to mark the end of an inserted number.
tokenUpdateBeginToc
syntax: <string>
default: '<!-- #BeginToc -->';
This option is used by HTML::TocUpdator to mark the beginning of an inserted ToC.
tokenUpdateEndToc
syntax: <string>
default: '<!-- #EndToc -->';
This option is used by HTML::TocUpdator to mark the end of an inserted ToC.
Known issues
Cygwin
In order for the test files to run on Cygwin without errors, the 'UNIX' default text file type has to be selected during the Cygwin setup. When extracting the tar.gz file with WinZip the 'TAR file smart CR/LF conversion' has to be turned off via {Options|Configuration...|Miscellaneous} in order for the files 'toc.pod' and './manualTest/manualTest1.htm' to be left in UNIX format.
AUTHOR
Freddy Vulto <fvulto@gmail.com>
COPYRIGHT
Copyright (c) 2009 Freddy Vulto. All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.