NAME

XHTML::Util::Cookbook

Strip all HTML

Destructive

print $xu->strip_tags(join(",", $xu->tags));

Note that this is a destructive action. The tags are gone from the object.

Non-destructive

print $xu->text;

Remember that you have access to the underlying XML::LibXML::Document through the doc and root methods, so the above is really just a convenience shortcut for:

print $xu->root->textContent;

This is non-destructive. The tags are still in the object.
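The same behaviour can be seen with XML::LibXML on its own. The following is a minimal sketch, not part of XHTML::Util: the sample markup and the variable names are made up for illustration, and it only shows that reading textContent leaves the tree unchanged.

```perl
use strict;
use warnings;
use XML::LibXML;

# Parse a small, made-up fragment directly with XML::LibXML.
my $doc = XML::LibXML->load_xml(
    string => '<div><p>Hello <b>world</b></p></div>'
);
my $root = $doc->documentElement;

# textContent concatenates the text of all descendant nodes.
my $text = $root->textContent;
print "$text\n";    # prints "Hello world"

# The tree itself is untouched: serializing it still shows the tags.
my $markup = $root->toString;
print "$markup\n";
```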

Bag it

Strip scripts

Keeping the script content

$xu->strip_tags("script");

Removing the tag and its content

$xu->remove("script");

Strip links, leaving text

$xu = XHTML::Util->new(\q{Click <a href="#">here</a>});
print $xu->strip_tags("a");

Strip external (non-relative) links, leaving text

print $xu->strip_tags('a[href^="http"]');
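For comparison, here is a rough sketch of the same idea done by hand on an XML::LibXML tree. It is not how XHTML::Util implements strip_tags; unwrap_external_links is a hypothetical helper, the sample markup is invented, and the XPath assumes elements without an XML namespace (namespaced XHTML would need XML::LibXML::XPathContext).

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: replace each <a> whose href starts with "http"
# by its own children, so the link text stays in place.
sub unwrap_external_links {
    my ($root) = @_;
    for my $link ($root->findnodes('.//a[starts-with(@href, "http")]')) {
        my $parent = $link->parentNode;
        # Move the children in front of the link, then drop the empty link.
        $parent->insertBefore($_, $link) for $link->childNodes;
        $parent->removeChild($link);
    }
    return $root;
}

my $doc = XML::LibXML->load_xml(string =>
    '<p>See <a href="http://example.com/">this site</a> or <a href="/local">this page</a>.</p>'
);
my $result = unwrap_external_links($doc->documentElement)->toString;
print "$result\n";
```

The relative link is left alone because its href does not begin with "http".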

Wrap pre content

Long lines inside <pre/> elements can break page layouts, or overflow their container and become unreadable.
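One possible approach, sketched here under assumptions, is to hard-wrap the text nodes inside each pre element with the core Text::Wrap module. wrap_pre is a hypothetical helper, not an XHTML::Util method, and the column width is arbitrary. Text::Wrap expands tabs to spaces, and wrapping each text node separately is only approximate when a pre element contains inline markup.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);
use XML::LibXML;

# Hypothetical helper: hard-wrap the text inside every <pre> so that
# no line is longer than $columns - 1 characters.
sub wrap_pre {
    my ($root, $columns) = @_;
    local $Text::Wrap::columns  = $columns;
    local $Text::Wrap::huge     = 'wrap';   # split words longer than a line
    local $Text::Wrap::unexpand = 0;        # do not turn spaces into tabs
    for my $text_node ($root->findnodes('.//pre//text()')) {
        # Wrap each existing line on its own so original breaks survive.
        my @lines = split /\n/, $text_node->data, -1;
        $text_node->setData(join "\n", map { wrap('', '', $_) } @lines);
    }
    return $root;
}

my $doc = XML::LibXML->load_xml(string =>
    '<div><pre>one two three four five six seven eight</pre></div>'
);
wrap_pre($doc->documentElement, 20);
my ($pre) = $doc->findnodes('//pre');
my $wrapped_text = $pre->textContent;
print "$wrapped_text\n";
```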

Downgrade headers

To do.

Transform text

To do.

Custom tags

To do.

SEE ALSO

XHTML::Util.