NAME

HTML::Clean::Human - html syntax cleaner and reformatter for human beings

DESCRIPTION

This takes HTML-like code and strips out the 'non-important' stuff. This is NOT an HTML-to-text converter. It is an HTML syntax filter/reformatter.

My initial temptation was to simply look for an HTML-to-text solution. But then I realized HTML code may have links and other ephemera that would be desirable to keep.

What I want it to get rid of: all the stupid HTML things, such as inline font declarations.

This code is useful if you edit HTML and have to start from existing HTML that some wacko WYSIWYG junk spat out. Run it through this and voila.

So, if you have to 'remake' a stupid website somebody else made, this is useful. You can just point it at the URL and pull the code down.
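To make the difference between filtering and converting to text concrete, here is a minimal sketch. It is not the module's code: sketch_strip_presentation is a hypothetical regex-based stand-in, and the sample markup is invented for illustration. It only shows the idea of dropping font tags and inline styles while the link survives.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of HTML::Clean::Human. A regex-based
# stand-in that drops <font> tags and inline style attributes while
# leaving links and text untouched. Regexes are good enough for a
# sketch, but they are not a real HTML parser.
sub sketch_strip_presentation {
    my ($html) = @_;
    $html =~ s{</?font\b[^>]*>}{}gi;                      # opening and closing <font> tags
    $html =~ s{\s+style\s*=\s*(?:"[^"]*"|'[^']*')}{}gi;   # inline style="..." attributes
    return $html;
}

# Invented sample input, the kind of thing a WYSIWYG editor emits.
my $messy = q{<p style="margin:0"><font face="Arial" size="2">See <a href="http://leocharre.com">my site</a>.</font></p>};

print sketch_strip_presentation($messy), "\n";
```

An HTML-to-text converter would typically reduce this to plain text and lose the href; the filter approach keeps the paragraph and the link and throws away only the presentation markup.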

SYNOPSIS

use HTML::Clean::Human;

my $c = HTML::Clean::Human->new('http://leocharre.com'); # directly from the web!

my $cleaned = $c->clean;

my $original = $c->html_original;

Or use the provided script, htmlclean:

htmlclean http://leocharre.com > cleaned.html

PROCEDURAL SUBS

Not exported by default. Each of these takes an HTML string as its argument and returns the filtered string.

fix_whitespace()

headings2text()

rip_comments()

rip_fonts()

rip_formatting()

rip_forms()

rip_headers()

rip_javascript()

rip_lists()

rip_styles()

rip_tables()

rip_tag()
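Since every sub above is string in, string out, they can be chained. The sketch below shows that pattern with self-contained stand-ins: sketch_rip_comments, sketch_fix_whitespace and apply_filters are hypothetical names written for this example, not the module's implementations. With the module itself, the real subs would presumably be requested by name on the use line, since they are not exported by default; that is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-ins, NOT the module's implementations. Each follows
# the contract described above: take an HTML string, return the filtered
# string.
sub sketch_rip_comments {
    my ($html) = @_;
    $html =~ s/<!--.*?-->//gs;    # non-greedy, so each comment is removed separately
    return $html;
}

sub sketch_fix_whitespace {
    my ($html) = @_;
    $html =~ s/[ \t]+/ /g;        # collapse runs of spaces and tabs
    $html =~ s/\n{3,}/\n\n/g;     # allow at most one blank line in a row
    return $html;
}

# Hypothetical helper: feed the result of each filter into the next.
sub apply_filters {
    my ($html, @filters) = @_;
    $html = $_->($html) for @filters;
    return $html;
}

my $html = "<p>Hello   <!-- draft note -->world</p>\n\n\n\n<p>Bye</p>\n";
print apply_filters($html, \&sketch_rip_comments, \&sketch_fix_whitespace);
```

Order matters in a chain like this: removing comments first means the whitespace pass also tidies up the gaps the comments leave behind.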

BUGS

No doubt there are some. Please contact the AUTHOR.

AUTHOR

Leo Charre leocharre at cpan dot org

COPYRIGHT

LICENSE

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, i.e., under the terms of the "Artistic License" or the "GNU General Public License".

DISCLAIMER

This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See the "GNU General Public License" for more details.

