NAME
HTML::Clean - Remove unwanted HTML tags while keeping their content
VERSION
Version 0.01
SYNOPSIS
Remove unwanted tags from the HTML, but leave the content.
<tag>content</tag>
Example:
use HTML::TreeBuilder::XPath;
use HTML::Clean;
my $tree = HTML::TreeBuilder::XPath->new_from_content($html);
my $news = $tree->findnodes('//div[@id="news"]')->[0];
my $hc = HTML::Clean->new();
my $clean_news = $hc->clean($news->as_HTML);
print $clean_news;
...
SUBROUTINES/METHODS
new
Initializes the object and sets the tags you want to strip.
clean
Removes the unwanted tags and leaves their content in place.
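As a rough illustration of what "remove the tags, keep the content" means, here is a minimal sketch in plain Perl. It is not this module's implementation: the helper name strip_tags and its arguments are hypothetical, and a regex like this one does not cope with comments, scripts, or attribute values that contain ">".

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of HTML::Clean: removes every opening and
# closing tag whose name appears in @$strip, leaving the enclosed text alone.
sub strip_tags {
    my ($html, $strip) = @_;
    my %strip = map { lc($_) => 1 } @$strip;

    # $1 is the whole tag, $2 is the tag name.
    $html =~ s{(</?([A-Za-z][A-Za-z0-9]*)\b[^>]*>)}
              { $strip{ lc($2) } ? '' : $1 }ge;
    return $html;
}

print strip_tags('<div><b>Breaking</b> <a href="/n/1">news</a></div>',
                 ['div', 'a']), "\n";
# prints: <b>Breaking</b> news
```

For anything beyond simple, well-formed fragments, a real parser is the safer choice than a regex.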
_remove_attrs
Private method that removes all HTML attributes.
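The effect of removing attributes can be sketched as follows. This is an assumption about the behaviour, not the module's actual code: remove_attrs is a hypothetical stand-in, and the regex assumes attribute values do not contain ">".

```perl
use strict;
use warnings;

# Hypothetical stand-in, NOT the module's private method: drops everything
# between the tag name and the end of each opening tag, keeping an optional
# trailing "/" so self-closing tags stay self-closing.
sub remove_attrs {
    my ($html) = @_;
    $html =~ s{<([A-Za-z][A-Za-z0-9]*)\b[^>]*?(\s*/?)>}{<$1$2>}g;
    return $html;
}

print remove_attrs('<p class="lead" id="top">Hi <a href="/x">there</a></p>'), "\n";
# prints: <p>Hi <a>there</a></p>
```

Closing tags are left untouched because the pattern only matches a "<" that is followed directly by a tag name.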
Accessors
tags
By default this contains all HTML tags.
in_text
This accessor contains the HTML tags that you would like to keep. To see the default value:
print Dumper $self->in_text;
AUTHOR
Daniel de Oliveira Mantovani, <daniel.oliveira.mantovani at gmail.com>
BUGS
Please report any bugs or feature requests to bug-html-clean at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=HTML-Clean. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc HTML::Clean
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
LICENSE AND COPYRIGHT
Copyright 2011 Daniel de Oliveira Mantovani.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.