NAME

Mojo::DOM - Minimalistic HTML5/XML DOM parser with CSS3 selectors

SYNOPSIS

use Mojo::DOM;

# Parse
my $dom = Mojo::DOM->new('<div><p id="a">A</p><p id="b">B</p></div>');

# Find
my $b = $dom->at('#b');
say $b->text;

# Walk
say $dom->div->p->[0]->text;
say $dom->div->children('p')->first->{id};

# Iterate
$dom->find('p[id]')->each(sub { say shift->{id} });

# Loop
for my $e ($dom->find('p[id]')->each) {
  say $e->text;
}

# Modify
$dom->div->p->[1]->append('<p id="c">C</p>');

# Render
say $dom;

DESCRIPTION

Mojo::DOM is a minimalistic and relaxed HTML5/XML DOM parser with CSS3 selector support. It will even try to interpret broken XML, so you should not use it for validation.

CASE SENSITIVITY

Mojo::DOM defaults to HTML5 semantics, which means all tags and attributes are lowercased and selectors need to be lowercase as well.

my $dom = Mojo::DOM->new('<P ID="greeting">Hi!</P>');
say $dom->at('p')->text;
say $dom->p->{id};

If XML processing instructions are found, the parser will automatically switch into XML mode and everything becomes case sensitive.

my $dom = Mojo::DOM->new('<?xml version="1.0"?><P ID="greeting">Hi!</P>');
say $dom->at('P')->text;
say $dom->P->{ID};

XML detection can also be disabled with the xml method.

# Force XML semantics
$dom->xml(1);

# Force HTML5 semantics
$dom->xml(0);
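
The difference between the two modes can be seen by parsing the same markup twice. This is a minimal sketch, not a definitive recipe: the variable names are illustrative, and it assumes that `at` yields an undefined value when nothing matches, which is not stated in this section.

```perl
use strict;
use warnings;
use Mojo::DOM;

my $markup = '<P ID="greeting">Hi!</P>';

# HTML5 semantics (the default): tag and attribute names are lowercased
my $html    = Mojo::DOM->new($markup);
my $html_id = $html->at('p')->{id};

# XML semantics forced with the xml method: names keep their original case
my $xml    = Mojo::DOM->new->xml(1)->parse($markup);
my $xml_id = $xml->at('P')->{ID};

# Assumption: a lowercase selector finds nothing in XML mode
my $miss = $xml->at('p');

print "HTML5 mode id: $html_id\n";
print "XML mode ID:   $xml_id\n";
print 'Lowercase selector matched in XML mode: ',
  (defined $miss ? 'yes' : 'no'), "\n";
```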

METHODS

Mojo::DOM implements the following methods.

`new`

my $dom = Mojo::DOM->new;
my $dom = Mojo::DOM->new('<foo bar="baz">test</foo>');

Construct a new Mojo::DOM object.

`all_text`

my $trimmed   = $dom->all_text;
my $untrimmed = $dom->all_text(0);

Extract all text content from the DOM structure, including descendant elements. Smart whitespace trimming is enabled by default.

# "foo bar baz"
$dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->div->all_text;

# "foo\nbarbaz\n"
$dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->div->all_text(0);

`append`

$dom = $dom->append('<p>Hi!</p>');

Append markup after this element, as its next sibling.

# "<div><h1>A</h1><h2>B</h2></div>"
$dom->parse('<div><h1>A</h1></div>')->at('h1')->append('<h2>B</h2>')->root;

`append_content`

$dom = $dom->append_content('<p>Hi!</p>');

Append to the end of this element's content, inside the element.

# "<div><h1>AB</h1></div>"
$dom->parse('<div><h1>A</h1></div>')->at('h1')->append_content('B')->root;
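
Because `append` and `append_content` are easy to confuse, the following sketch runs both against the same starting markup. It only restates the two examples above side by side; the variable names are illustrative.

```perl
use strict;
use warnings;
use Mojo::DOM;

my $markup = '<div><h1>A</h1></div>';

# append: the new markup becomes a sibling placed after the h1 element
my $outside = Mojo::DOM->new($markup);
$outside->at('h1')->append('<h2>B</h2>');

# append_content: the new content lands inside the h1 element, at the end
my $inside = Mojo::DOM->new($markup);
$inside->at('h1')->append_content('B');

# Stringifying a Mojo::DOM object renders it
print "append:         $outside\n";
print "append_content: $inside\n";
```

`prepend` and `prepend_content` follow the same split, only at the leading edge.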

`at`

my $result = $dom->at('html title');

Find a single element with CSS3 selectors. All selectors from Mojo::DOM::CSS are supported.

# Find first element with "svg" namespace definition
my $namespace = $dom->at('[xmlns\:svg]')->{'xmlns:svg'};
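
Chaining a method call directly onto `at` fails if the selector matches nothing. The sketch below guards against that case; it assumes that `at` yields an undefined value when there is no match, which this section does not state explicitly, and the `#missing` id is purely illustrative.

```perl
use strict;
use warnings;
use Mojo::DOM;

my $dom = Mojo::DOM->new('<div><p id="a">A</p></div>');

# A selector that matches
my $found = $dom->at('#a');

# A selector that matches nothing (assumed to give an undefined value)
my $missing = $dom->at('#missing');

# Check before calling methods on the result
my $found_text   = defined $found   ? $found->text   : '(none)';
my $missing_text = defined $missing ? $missing->text : '(none)';

print "found:   $found_text\n";
print "missing: $missing_text\n";
```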

`attrs`

my $attrs = $dom->attrs;
my $foo   = $dom->attrs('foo');
$dom      = $dom->attrs({foo => 'bar'});
$dom      = $dom->attrs(foo => 'bar');

Element attributes.

`charset`

my $charset = $dom->charset;
$dom        = $dom->charset('UTF-8');

Alias for "charset" in Mojo::DOM::HTML.

`children`

my $collection = $dom->children;
my $collection = $dom->children('div');

Return a Mojo::Collection object containing the children of this element, similar to find.

# Show type of random child element
say $dom->children->shuffle->first->type;

`content_xml`

my $xml = $dom->content_xml;

Render content of this element to XML.

# "<b>test</b>"
$dom->parse('<div><b>test</b></div>')->div->content_xml;

`find`

my $collection = $dom->find('html title');

Find elements with CSS3 selectors and return a Mojo::Collection object. All selectors from Mojo::DOM::CSS are supported.

# Find a specific element and extract information
my $id = $dom->find('div')->[23]{id};

# Extract information from multiple elements
my @headers = $dom->find('h1, h2, h3')->map(sub { shift->text })->each;
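
As a self-contained variant of the examples above, this sketch collects attribute values from every match and then picks one element by position. The markup and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Mojo::DOM;

my $dom
  = Mojo::DOM->new('<div><p id="a">A</p><p id="b">B</p><p>C</p></div>');

# Only paragraphs carrying an id attribute match, in document order
my @ids = $dom->find('p[id]')->map(sub { shift->{id} })->each;

# The collection can also be indexed like an array reference
my $third = $dom->find('p')->[2]->text;

print "ids: @ids\n";
print "third paragraph: $third\n";
```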

`namespace`

my $namespace = $dom->namespace;

Find element namespace.

# Find namespace for an element with namespace prefix
my $namespace = $dom->at('svg > svg\:circle')->namespace;

# Find namespace for an element that may or may not have a namespace prefix
my $namespace = $dom->at('svg > circle')->namespace;
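
The two lookups above can be tried against small XML documents. This is a sketch under assumptions: the documents are invented for illustration, XML mode is forced with `xml(1)` to keep names case sensitive, and the namespace is expected to be resolved from the declaration on the enclosing `svg` element.

```perl
use strict;
use warnings;
use Mojo::DOM;

my $uri = 'http://www.w3.org/2000/svg';

# Default namespace declared on the parent element
my $plain = Mojo::DOM->new->xml(1)
  ->parse(qq{<svg xmlns="$uri"><circle r="1"></circle></svg>});
my $default_ns = $plain->at('svg > circle')->namespace;

# Namespace bound to the "svg" prefix; the colon is escaped in the selector
my $prefixed = Mojo::DOM->new->xml(1)
  ->parse(qq{<svg xmlns:svg="$uri"><svg:circle r="1"></svg:circle></svg>});
my $prefix_ns = $prefixed->at('svg > svg\:circle')->namespace;

print "default:  $default_ns\n";
print "prefixed: $prefix_ns\n";
```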

`parent`

my $parent = $dom->parent;

Parent of element.

`parse`

$dom = $dom->parse('<foo bar="baz">test</foo>');

Alias for "parse" in Mojo::DOM::HTML.

# Parse UTF-8 encoded XML
my $dom = Mojo::DOM->new->charset('UTF-8')->xml(1)->parse($xml);

`prepend`

$dom = $dom->prepend('<p>Hi!</p>');

Prepend markup before this element, as its previous sibling.

# "<div><h1>A</h1><h2>B</h2></div>"
$dom->parse('<div><h2>B</h2></div>')->at('h2')->prepend('<h1>A</h1>')->root;

`prepend_content`

$dom = $dom->prepend_content('<p>Hi!</p>');

Prepend to the start of this element's content, inside the element.

# "<div><h2>AB</h2></div>"
$dom->parse('<div><h2>B</h2></div>')->at('h2')->prepend_content('A')->root;

`replace`

my $old = $dom->replace('<div>test</div>');

Replace element.

# "<div><h2>B</h2></div>"
$dom->parse('<div><h1>A</h1></div>')->at('h1')->replace('<h2>B</h2>')->root;

# "<div></div>"
$dom->parse('<div><h1>A</h1></div>')->at('h1')->replace('')->root;
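
Replacing with an empty string, as in the second example, can be combined with `find` to strip every matching element from a document. A minimal sketch, with a hypothetical `ad` class chosen only for illustration:

```perl
use strict;
use warnings;
use Mojo::DOM;

my $dom = Mojo::DOM->new(
  '<div><p>keep</p><span class="ad">drop</span><p>also keep</p></div>');

# Replace each matching element with nothing, which removes it
$dom->find('span.ad')->each(sub { shift->replace('') });

my $cleaned = "$dom";
print "$cleaned\n";
```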

`replace_content`

$dom = $dom->replace_content('test');

Replace element content.

# "<div><h1>B</h1></div>"
$dom->parse('<div><h1>A</h1></div>')->at('h1')->replace_content('B')->root;

# "<div><h1></h1></div>"
$dom->parse('<div><h1>A</h1></div>')->at('h1')->replace_content('')->root;

`root`

my $root = $dom->root;

Find root node.

`text`

my $trimmed   = $dom->text;
my $untrimmed = $dom->text(0);

Extract text content from this element only (not including child elements). Smart whitespace trimming is enabled by default.

# "foo baz"
$dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->div->text;

# "foo\nbaz\n"
$dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->div->text(0);
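
To make the contrast with `all_text` concrete, the sketch below calls both on the same element. Exact spacing in the `all_text` result depends on the trimming rules, so the comments describe only which text is included; the markup is illustrative.

```perl
use strict;
use warnings;
use Mojo::DOM;

my $dom = Mojo::DOM->new('<p>Hello<b>World</b></p>');
my $p   = $dom->at('p');

# text: only the text directly inside the p element
my $own = $p->text;

# all_text: also includes the text of the nested b element
my $all = $p->all_text;

print "text:     $own\n";
print "all_text: $all\n";
```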

`text_after`

my $trimmed   = $dom->text_after;
my $untrimmed = $dom->text_after(0);

Extract text content immediately following this element. Smart whitespace trimming is enabled by default.

# "baz"
$dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->div->p->text_after;

# "baz\n"
$dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->div->p->text_after(0);

`text_before`

my $trimmed   = $dom->text_before;
my $untrimmed = $dom->text_before(0);

Extract text content immediately preceding this element. Smart whitespace trimming is enabled by default.

# "foo"
$dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->div->p->text_before;

# "foo\n"
$dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->div->p->text_before(0);

`to_xml`

my $xml = $dom->to_xml;

Render this element and its content to XML.

# "<b>test</b>"
$dom->parse('<div><b>test</b></div>')->div->b->to_xml;

`tree`

my $tree = $dom->tree;
$dom     = $dom->tree(['root', [qw(text lalala)]]);

Alias for "tree" in Mojo::DOM::HTML.

`type`

my $type = $dom->type;
$dom     = $dom->type('div');

Element type.

# List types of child elements
$dom->children->each(sub { say $_->type });

`xml`

my $xml = $dom->xml;
$dom    = $dom->xml(1);

Alias for "xml" in Mojo::DOM::HTML.

CHILD ELEMENTS

In addition to the methods above, many child elements are also automatically available as object methods, which return a Mojo::DOM or Mojo::Collection object, depending on the number of children.

say $dom->p->text;
say $dom->div->[23]->text;
$dom->div->each(sub { say $_->text });

ELEMENT ATTRIBUTES

Direct hash reference access to element attributes is also possible.

say $dom->{foo};
say $dom->div->{id};
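
A short sketch of hash reference access in practice. Reading follows the examples above; writing through the same hash reference is an assumption not confirmed by this section, and the `class` value is illustrative.

```perl
use strict;
use warnings;
use Mojo::DOM;

my $dom = Mojo::DOM->new('<div id="main"><p id="a">A</p></div>');
my $p   = $dom->at('p');

# Read an attribute
my $id = $p->{id};

# Assumption: assigning through the hash reference sets the attribute
$p->{class} = 'intro';

# Check whether an attribute is present
my $has_title = exists $p->{title} ? 'yes' : 'no';

print "id: $id\n";
print "has title: $has_title\n";
print "$dom\n";
```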
