Changes for version 5.900 - 2012-12-15
- Trial Release by Christopher J. Madsen
- THINGS THAT MAY BREAK YOUR CODE OR TESTS
- parse_file (and new_from_file) now try to determine the encoding automatically when given a filename (not a filehandle). To restore the old behavior, set the new encoding attribute to the empty string. To restore it globally, set $HTML::Element::default_encoding = ''.
- ENHANCEMENTS
- new_from_file & new_from_url now let you set parsing attributes
- New shortcut constructor new_from_string is like new_from_content, but allows you to set parsing attributes
- New shortcut constructor new_from_http for constructing a tree from the content of an HTTP::Message (or a subclass such as HTTP::Response)
- Setting the new self_closed_tags attribute to 1 makes TreeBuilder handle XML-style self-closed tags (e.g. <a id="a1" />)
- New child_nodes method makes for simpler recursion
- New openw and encode_fh methods for writing a file with the correct encoding
- DOCUMENTATION
- Document that new accepts optional attributes. (It has since at least 3.18, although this was undocumented, and it did not previously work with ignore_ignorable_whitespace.)
- methods & attributes added in version 4.0 or later are now marked
- don't recommend the traverse method; give recursive example (RT #48344)
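A minimal sketch of the kind of recursion the last item refers to. It is not the example from the distribution's documentation. It assumes only the long-standing content_list and tag methods (rather than the new child_nodes method), and collect_tags is a hypothetical helper name chosen for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: append the tag name of every element under $node
# (including $node itself) to the array referenced by $tags, in document order.
sub collect_tags {
    my ($node, $tags) = @_;
    return unless ref $node;    # text nodes are plain strings, not objects
    push @$tags, $node->tag;
    collect_tags($_, $tags) for $node->content_list;
    return;
}

my $tree = HTML::TreeBuilder->new;
$tree->parse('<html><head><title>t</title></head>'
           . '<body><p>Hello <b>world</b></p></body></html>');
$tree->eof;                     # signal end of input so the tree is finalized

my @tags;
collect_tags($tree, \@tags);
print join(' ', @tags), "\n";

$tree->delete;                  # release the tree explicitly
```

Writing the walk as an ordinary recursive subroutine keeps the control flow visible: skipping a subtree is a plain return, with no callback return-value conventions to remember.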
- TESTS
- Add test for self_closed_tags attribute.
- Clarify skip message in construct_tree.t (RT #81371)