NAME
HTML::RelExtor - Extract "rel" and "rev" information from LINK and A tags.
SYNOPSIS
use HTML::RelExtor;
my $parser = HTML::RelExtor->new();
$parser->parse($html);
for my $link ($parser->links) {
    print $link->href, "\n" if $link->has_rel('nofollow');
}
my($canonical) = grep $_->has_rev('canonical'), $parser->links;
if ($canonical) {
    $shorten_url = $canonical->href;
}
DESCRIPTION
HTML::RelExtor is an HTML parser module that extracts relationship information from A
and LINK HTML tags.
METHODS
- new
-
$parser = HTML::RelExtor->new();
$parser = HTML::RelExtor->new(base => $base_uri);
Creates a new HTML::RelExtor object.
- parse
-
$parser->parse($html);
Parses HTML content. See HTML::Parser for other method signatures.
- links
-
my @links = $parser->links();
my @links = $parser->links(rel => 'alternate');
my @links = $parser->links(rev => 'canonical');
Returns a list of HTML::RelExtor::Link objects, one for each link that has a 'rel' or 'rev' attribute. When given a rel or rev parameter, returns only the links that have that rel or rev value.
# These are equivalent
@links = $parser->links(rel => 'alternate');
@links = grep $_->has_rel('alternate'), $parser->links;
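The following is a minimal sketch of the filtering described above, using only the methods documented in this section. The HTML snippet and its example.com URLs are made up for illustration.

```perl
use strict;
use warnings;
use HTML::RelExtor;

# Illustrative input: one LINK with rel="alternate", one A with
# rel="nofollow", and one A with no rel or rev attribute at all.
my $html = <<'HTML';
<html><head>
<link rel="alternate" type="application/rss+xml" href="http://example.com/feed.xml">
</head><body>
<a href="http://example.com/spam" rel="nofollow">spam</a>
<a href="http://example.com/plain">no rel here</a>
</body></html>
HTML

my $parser = HTML::RelExtor->new();
$parser->parse($html);

# Filter through the links() parameter ...
my @alternate = $parser->links(rel => 'alternate');

# ... or filter the full list by hand with has_rel().
my @same = grep { $_->has_rel('alternate') } $parser->links;

printf "%s %s\n", $_->tag, $_->href for @alternate;
```

Both lists should hold the same single LINK entry. The parameter form is shorter, while the grep form combines more easily with other conditions such as a check on tag.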
HTML::RelExtor::Link METHODS
- href
-
my $href = $link->href;
Returns the 'href' attribute of the link.
- tag
-
my $tag = $link->tag;
Returns the tag name of the link in lowercase, either 'a' or 'link'.
- attr
-
my $attr = $link->attr;
Returns a hash reference of attributes of the tag.
- rel
-
my @rel = $link->rel;
Returns a list of 'rel' attribute values. If a link contains <a href="http://example.com/" rel="tag nofollow">blahblah</a>, the rel() method returns a list that contains tag and nofollow.
- rev
-
my @rev = $link->rev;
Returns a list of 'rev' attribute values.
- has_rel
-
if ($link->has_rel('nofollow')) { }
A handy shortcut method to find out if a link contains a specific relationship.
- has_rev
-
if ($link->has_rev('canonical')) { }
A handy shortcut method to find out if a link contains a specific reverse relationship.
- text
-
my $text = $link->text;
Returns the text inside the tag. Only available for A tags; returns undef when called on LINK tags.
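The rel() and rev() methods return lists because the HTML attributes hold space-separated values. The sketch below shows that splitting in plain Perl. It only illustrates the semantics and does not reproduce the module's internals; the helper name split_rel_tokens is hypothetical and not part of HTML::RelExtor.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::RelExtor: split a rel or rev
# attribute value into its whitespace-separated tokens.
sub split_rel_tokens {
    my ($value) = @_;
    return () unless defined $value;
    # split ' ' skips leading whitespace and splits on runs of whitespace
    return split ' ', $value;
}

my @rel = split_rel_tokens('tag nofollow');
print join(', ', @rel), "\n";    # prints "tag, nofollow"
```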
EXAMPLES
Collect A links tagged with rel="friend", as used in XFN (XHTML Friends Network).
my $p = HTML::RelExtor->new();
$p->parse($html);
my @links = map { $_->href }
    grep { $_->tag eq 'a' && $_->has_rel('friend') } $p->links;
TODO
Accept callback parameter when creating a new instance.
AUTHOR
Tatsuhiko Miyagawa <miyagawa at bulknews.net>
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
SEE ALSO
http://www.w3.org/TR/REC-html40/struct/links.html
http://www.google.com/googleblog/2005/01/preventing-comment-spam.html
http://developers.technorati.com/wiki/RelTag
http://shiflett.org/blog/2009/apr/save-the-internet-with-rev-canonical