NAME

Formatter::HTML::HTML - Formatter to clean existing HTML

SYNOPSIS

use Formatter::HTML::HTML;
my $formatter = Formatter::HTML::HTML->format($data);
print $formatter->document;
print $formatter->title;
my $links = $formatter->links;
print $links->[0]{uri};

DESCRIPTION

This module cleans an existing HTML document using HTML::Tidy. It inherits from HTML::Tidy, so the methods of that class are also available. It can also extract the links and the title of the document (using HTML::TokeParser).
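To illustrate the kind of extraction described above, here is a minimal sketch that uses HTML::TokeParser directly. It is not this module's actual implementation. The sample HTML and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Made-up input for illustration.
my $html = '<html><head><title>Example page</title></head>'
         . '<body><p><a href="http://example.org/">Example site</a></p></body></html>';

my $parser = HTML::TokeParser->new(\$html) or die "Cannot create parser";

my $title;
my @links;

# Walk the document, stopping at each title or a start tag.
while (my $tag = $parser->get_tag('title', 'a')) {
    if ($tag->[0] eq 'title') {
        $title = $parser->get_trimmed_text('/title');
    }
    elsif (defined $tag->[1]{href}) {
        # Same shape as the links method describes: uri plus link text.
        push @links, {
            uri   => $tag->[1]{href},
            title => $parser->get_trimmed_text('/a'),
        };
    }
}

print "Title: $title\n" if defined $title;
print "$_->{uri}: $_->{title}\n" for @links;
```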

METHODS

This module conforms with the Formatter API specification, version 0.93:

format($string): Call this to initialise the formatter. It takes the input text as a string argument and returns an object of this class.
document([$charset]): Returns a full, cleaned and valid HTML document. If you pass the optional $charset parameter, the document will include an HTML meta element declaring that character set. It remains your responsibility to ensure that the document you serve is actually encoded with this character set.
fragment: This will return only the contents of the body element.
links: Returns all links found in the input string as an arrayref. Each element is a hashref with the keys uri (the address) and title (the link text).
title: Will return the title of the document as seen in the HTML title element or undef if none can be found.
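
The methods listed above could be combined roughly as follows. This is a sketch based only on the descriptions in this section. The sample HTML is made up, and the exact cleaned output depends on HTML::Tidy.

```perl
use strict;
use warnings;
use Formatter::HTML::HTML;

# Made-up input for illustration.
my $html = <<'HTML';
<html><head><title>Example page</title></head>
<body><p>See <a href="http://example.org/">the example site</a>.</p></body></html>
HTML

my $formatter = Formatter::HTML::HTML->format($html);

# Full document, with a meta element declaring the character set.
print $formatter->document('utf-8');

# Only the contents of the body element.
print $formatter->fragment;

# title returns undef when no title element is found, so check first.
my $title = $formatter->title;
print defined $title ? "Title: $title\n" : "No title found\n";

# links returns an arrayref; each element has uri and title keys.
my $links = $formatter->links;
for my $link (@{$links}) {
    print "$link->{uri} ($link->{title})\n";
}
```

Note that passing a charset to document only adds the declaration. Encoding the output accordingly is still up to the caller.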

TODO

Both the fragment and document methods use naive regular expressions to strip off elements and add a meta element respectively. This is clearly not very reliable, and should be done with a proper parser.
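
One possible parser-based approach to the body extraction is sketched below with HTML::TokeParser. The function body_contents is hypothetical and not part of this module. It only illustrates the idea and does not handle every edge case, such as documents without an explicit body element.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: return the source text between <body> and </body>.
sub body_contents {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html) or return;
    my $in_body = 0;
    my $out     = '';
    while (my $token = $parser->get_token) {
        my ($type, $name) = @{$token}[0, 1];
        if ($type eq 'S' && $name eq 'body') { $in_body = 1; next }
        last if $type eq 'E' && $name eq 'body';
        next unless $in_body;
        # The original source text sits at a type-dependent index:
        # start tags at 4, end tags and processing instructions at 2,
        # text, comments and declarations at 1.
        $out .= $type eq 'S'  ? $token->[4]
              : $type eq 'E'  ? $token->[2]
              : $type eq 'PI' ? $token->[2]
              :                 $token->[1];
    }
    return $out;
}

my $fragment = body_contents(
    '<html><head><title>t</title></head>'
  . '<body class="x"><p>Hello</p></body></html>'
);
print "$fragment\n";
```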

AUTHOR

Kjetil Kjernsmo, <kjetilk@cpan.org>

COPYRIGHT AND LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.

