Name
Data::Edit::Xml - Edit data held in xml format
Synopsis
Transform some DocBook xml into Dita:
use Data::Edit::Xml;
# Docbook
say STDERR Data::Edit::Xml::new(<<END)
<sli>
<li>
<p>Diagnose the problem</p>
<p>This can be quite difficult</p>
<p>Sometimes impossible</p>
</li>
<li>
<p><pre>ls -la</pre></p>
<p><pre>
drwxr-xr-x 2 phil phil 4096 Jun 15 2016 Desktop
drwxr-xr-x 2 phil phil 4096 Nov 9 20:26 Downloads
</pre></p>
</li>
</sli>
END
# Transform to Dita
->by(sub
{my ($o, $p) = @_;
if ($o->at(qw(pre p li sli)) and $o->isOnlyChild)
{$o->change($p->isFirst ? qw(cmd) : qw(stepresult));
$p->unwrap;
}
elsif ($o->at(qw(li sli)) and $o->over(qr(\Ap( p)+\Z)))
{$_->change($_->isFirst ? qw(cmd) : qw(info)) for $o->contents;
}
})
->by(sub
{my ($o) = @_;
$o->change(qw(step)) if $o->at(qw(li sli));
$o->change(qw(steps)) if $o->at(qw(sli));
$o->id = 's'.($o->position+1) if $o->at(qw(step));
$o->id = 'i'.($o->index+1) if $o->at(qw(info));
$o->wrapWith(qw(screen)) if $o->at(qw(CDATA stepresult));
})
# Print
->prettyString;
Produces:
<steps>
<step id="s1">
<cmd>Diagnose the problem</cmd>
<info id="i1">This can be quite difficult</info>
<info id="i2">Sometimes impossible</info>
</step>
<step id="s2">
<cmd>ls -la</cmd>
<stepresult>
<screen>
drwxr-xr-x 2 phil phil 4096 Jun 15 2016 Desktop
drwxr-xr-x 2 phil phil 4096 Nov 9 20:26 Downloads
</screen>
</stepresult>
</step>
</steps>
Description
Construction
Create a parse tree
File or String
Construct a parse tree from a file or a string
new
New parse: call this method statically as in Data::Edit::Xml::new(file or string), or call it with no parameters and then use "input", "inputFile", "inputString", "errorsFile" to provide specific parameters for the parse, then call "parse" to perform the parse and return the parse tree
1 $fileNameOrString File name or string
attributes :lvalue
The attributes of this node, see also: "Attributes". The frequently used attributes: class, id, href, outputclass can be accessed by an lvalue method as in: $node->id = 'c1'
conditions :lvalue
Conditional strings attached to a node, see "Conditions"
content :lvalue
Content of command: the nodes immediately below this node in the order in which they appeared in the source text, see also "Contents"
indexes :lvalue
Indexes to sub commands by tag in the order in which they appeared in the source text
labels :lvalue
The labels attached to a node to provide addressability from other nodes, see: "Labels".
errorsFile :lvalue
Error listing file. Use this parameter to explicitly set the name of the file to which any parse errors will be written. By default this file is named: zzzParseErrors/out.data
inputFile :lvalue
Source file of the parse if this is the parser node. Use this parameter to explicitly set the file to be parsed.
input :lvalue
Source of the parse if this is the parser node. Use this parameter to specify some input either as a string or as a file name for the parser to convert into a parse tree
inputString :lvalue
Source string of the parse if this is the parser node. Use this parameter to explicitly set the string to be parsed.
parent :lvalue
Parent node of this node or undef if the root node. See also "Traversal" and "Navigation". Consider as read only.
parser :lvalue
Parser details: the root node of a tree is the parse node for that tree. Consider as read only.
tag :lvalue
Tag name for this node, see also "Traversal" and "Navigation". Consider as read only.
text :lvalue
Text of this node but only if it is a text node, i.e. the tag is cdata() <=> "isText" is true
cdata
The name of the tag to be used to represent text - this tag must not also be used as a command tag otherwise chaos will occur
parse
Parse input xml
1 $p Parser created by "new"
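The two-step form described under "new" can be sketched as follows. This is a minimal sketch that assumes the module is installed and that "new", "inputString" and "parse" behave as documented above; the sample XML and the variable names are illustrative only.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

# Step 1: create an empty parser by calling new with no parameters
my $p = Data::Edit::Xml::new();

# Step 2: supply the source explicitly through the lvalue method
$p->inputString = '<a><b/></a>';

# Step 3: perform the parse; per the description of "new" this returns the parse tree
my $tree = $p->parse;

print $tree->string, "\n";
```

The one-step form Data::Edit::Xml::new('<a><b/></a>') should give the same tree with less ceremony; the two-step form is only needed when parameters such as "errorsFile" have to be set first.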
tree
Build a tree representation of the parsed xml which can be easily traversed to look for things
1 $parent The parent node
2 $parse The remaining parse
This is a private method.
Node by Node
Construct a parse tree node by node
newText
Create a new text node
1 undef Any reference to this package
2 $text Content of new text node
newTag
Create a new non text node
1 undef Any reference to this package
2 $command The tag for the node
3 %attributes Attributes as a hash
newTree
Create a new tree - this is a static method
1 $command The name of the root node in the tree
2 %attributes Attributes of the root node in the tree as a hash
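A small tree can be assembled from these constructors without parsing any text. The sketch below assumes the module is installed and uses "putLast" and "string", which are described later in this document; the tag names, attribute values and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

# Root node <a id="1"> created with the static method newTree
my $root = Data::Edit::Xml::newTree('a', id => 1);

# newTag and newText ignore their first parameter, so any node can be the invocant
my $child = $root->newTag('b', class => 'x');
my $text  = $root->newText('hello');

# Attach the new nodes: <b> under <a>, then the text under <b>
$root->putLast($child);
$child->putLast($text);

print $root->string, "\n";
```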
disconnectLeafNode
Remove a leaf node from the parse tree and make it into its own parse tree
1 $node Leaf node to disconnect
This is a private method.
indexNode
Index the children of a node so that we can access them by tag and number
1 $node Node to index
This is a private method.
replaceSpecialChars
Replace < > " with &lt; &gt; &quot;. Larry Wall's excellent Xml parser unfortunately replaces &lt; &gt; &quot; &amp; etc. with their expansions in text by default and does not seem to provide an obvious way to stop this behavior, so we have to put them back again using this method. Worse, we cannot decide whether to replace & with &amp; or leave it as is: consequently you might have to examine the instances of & in your output text and guess based on the context.
1 $string String to be edited
tags
Count the number of tags in a parse tree
1 $node Parse tree
renew
Returns a renewed copy of the parse tree: use this method if you have added nodes via the "Put as text" methods and wish to reprocess them
1 $node Parse tree
clone
Return a clone of the parse tree: use this method if you want to make temporary changes to a parse tree
1 $node Parse tree
equals
Decide whether two parse trees are equal or not
1 $node1 Parse tree 1
2 $node2 Parse tree 2
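As an illustration of "clone" and "equals", the following sketch makes a temporary change to a copy and checks that the original is untouched. It assumes the module is installed; "first", "change" and "string" are the methods documented elsewhere in this document, and the sample XML is hypothetical.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x    = Data::Edit::Xml::new('<a><b/></a>');
my $copy = $x->clone;

# The clone starts out equal to the original
my $sameBefore = $copy->equals($x) ? 1 : 0;

# Rename <b> to <c> in the copy only
$copy->first->change('c');

# The two trees now differ, and the original is unchanged
my $sameAfter = $copy->equals($x) ? 1 : 0;

print "equal before change: $sameBefore, after change: $sameAfter\n";
print "original: ", $x->string, "\n";
```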
save
Save a copy of the parse tree to a file which can be restored and return the saved node
1 $node Parse tree
2 $file File
restore
Return a parse tree from a copy saved in a file by "save" - this is a static method so call it as Data::Edit::Xml::restore(file name)
1 $file File
Stringification
Create a string representation of the parse tree with optional selection of nodes via conditions
Print the parse tree
string
Return a string representing a node of a parse tree and all the nodes below it
1 $node Start node
stringQuoted
Return a quoted string representing a node of a parse tree and all the nodes below it
1 $node Start node
stringReplacingIdWithLabels
Return a string representing a node of a parse tree with all the id attributes replaced with the labels attached to each node
1 $node Start node
stringReplacingIdWithLabelsQuoted
Return a quoted string representing a node of a parse tree with all the id attributes replaced with the labels attached to each node
1 $node Start node
contentString
Return a string representing all the nodes below a node of a parse tree
1 $node Start node
prettyString
Return a readable string representing a node of a parse tree and all the nodes below it
1 $node Start node
2 $depth Depth
prettyStringShowingCDATA
Return a readable string representing a node of a parse tree and all the nodes below it with the text fields wrapped with <CDATA>...</CDATA>
1 $node Start node
2 $depth Depth
PrettyContentString
Return a readable string representing all the nodes below a node of a parse tree - infrequent use and so capitalized to avoid being presented as an option by Geany
1 $node Start node
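The difference between the main stringification methods can be seen on a small tree. This sketch assumes the module is installed; the sample XML and variable names are illustrative.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b>text</b><c/></a>');

my $compact = $x->string;         # the node and everything below it, on one line
my $inner   = $x->contentString;  # only the nodes below <a>
my $pretty  = $x->prettyString;   # the same tree laid out over several lines

print "$compact\n$inner\n$pretty";
```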
Conditions
Print a subset of the parse tree determined by the conditions attached to it
stringWithConditions
Return a string representing a node of a parse tree and all the nodes below it subject to conditions to select or reject some nodes
1 $node Start node
2 @conditions Conditions to be regarded as in effect
addConditions
Add conditions to a node and return the node
1 $node Node
2 @conditions Conditions to add
deleteConditions
Delete conditions applied to a node and return the node
1 $node Node
2 @conditions Conditions to delete
listConditions
Return a list of conditions applied to a node
1 $node Node
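Conditions can be used to print variants of one tree. The sketch below assumes the module is installed and reflects my understanding of the selection rule: a node carrying conditions is printed only if one of them is in effect, and when no conditions are supplied every node is printed. The condition names are invented for the example.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b/><c/></a>');
my ($first, $second) = ($x->first, $x->last);

# Attach conditions to <b> and <c>
$first ->addConditions(qw(bb BB));
$second->addConditions(qw(cc CC));

my $all    = $x->stringWithConditions;          # no conditions in effect
my $onlyBb = $x->stringWithConditions(qw(bb));  # condition bb in effect
my $onlyCc = $x->stringWithConditions(qw(cc));  # condition cc in effect

print "$all\n$onlyBb\n$onlyCc\n";
```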
Attributes
Get or set attributes
attr :lvalue
Return the value of an attribute of the current node as an assignable value
1 $node Node in parse tree
2 $attribute Attribute name
attrs
Return the values of the specified attributes of the current node
1 $node Node in parse tree
2 @attributes Attribute names
attrCount
Return the number of attributes in the specified node
1 $node Node in parse tree
setAttr
Set the value of an attribute in a node and return the node
1 $node Node in parse tree
2 %values (attribute name=>new value)*
deleteAttr
Delete the attribute, optionally checking its value first and return the node
1 $node Node
2 $attr Attribute name
3 $value Optional attribute value to check first
deleteAttrs
Delete any attributes mentioned in a list without checking their values and return the node
1 $node Node
2 @attrs Attribute names
renameAttr
Change the name of an attribute regardless of whether the new attribute already exists and return the node
1 $node Node
2 $old Existing attribute name
3 $new New attribute name
changeAttr
Change the name of an attribute unless it has already been set and return the node
1 $node Node
2 $old Existing attribute name
3 $new New attribute name
renameAttrValue
Change the name and value of an attribute regardless of whether the new attribute already exists and return the node
1 $node Node
2 $old Existing attribute name
3 $oldValue Existing attribute value
4 $new New attribute name
5 $newValue New attribute value
changeAttrValue
Change the name and value of an attribute unless it has already been set and return the node
1 $node Node
2 $old Existing attribute name
3 $oldValue Existing attribute value
4 $new New attribute name
5 $newValue New attribute value
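A few of the attribute methods used together, as a minimal sketch that assumes the module is installed. The attribute names and values are arbitrary.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a id="a1" class="old"/>');

$x->attr('href') = 'x.html';          # lvalue assignment to an attribute
$x->setAttr(outputclass => 'note');   # set via name => value pairs
$x->renameAttr(qw(class type));       # class="old" becomes type="old"
$x->deleteAttr('href');               # delete without checking the value

my ($id, $type) = $x->attrs(qw(id type));
my $count       = $x->attrCount;

print "id=$id type=$type count=$count\n";
```

Most of these methods return the node, so calls such as setAttr and renameAttr can also be chained.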
Traversal
Traverse the parse tree
by
Post-order traversal of a parse tree or sub tree and return the specified starting node
1 $node Starting node
2 $sub Sub to call for each sub node
3 @context Accumulated context
byReverse
Reverse post-order traversal of a parse tree or sub tree and return the specified starting node
1 $node Starting node
2 $sub Sub to call for each sub node
3 @context Accumulated context
down
Pre-order traversal down through a parse tree or sub tree and return the specified starting node
1 $node Starting node
2 $sub Sub to call for each sub node
3 @context Accumulated context
downReverse
Reverse pre-order traversal down through a parse tree or sub tree and return the specified starting node
1 $node Starting node
2 $sub Sub to call for each sub node
3 @context Accumulated context
through
Traverse parse tree visiting each node twice and return the specified starting node
1 $node Starting node
2 $before Sub to call when we meet a node
3 $after Sub to call when we leave a node
4 @context Accumulated context
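The visiting order of the traversal methods can be compared by recording tags as they are reached. This sketch assumes the module is installed; the sub receives the current node as its first parameter, as in the Synopsis.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b><c/></b><d/></a>');

my (@post, @pre, @both);

# Post-order: children are visited before their parent
$x->by(sub {push @post, $_[0]->tag});

# Pre-order: the parent is visited before its children
$x->down(sub {push @pre, $_[0]->tag});

# Each node is visited twice: once on the way in (+) and once on the way out (-)
$x->through(sub {push @both, '+'.$_[0]->tag},
            sub {push @both, '-'.$_[0]->tag});

print "by:      @post\n";
print "down:    @pre\n";
print "through: @both\n";
```

Post-order suits edits that restructure a node once its children are final, which is why the Synopsis uses "by".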
Contents
Contents of the specified node
contents
Return all the nodes contained by this node either as an array or as a reference to such an array
1 $node Node
contentBeyond
Return all the nodes following this node at the level of this node
1 $node Node
contentBefore
Return all the nodes preceding this node at the level of this node
1 $node Node
contentAsTags
Return a string containing the tags of all the nodes contained by this node separated by single spaces
1 $node Node
contentBeyondAsTags
Return a string containing the tags of all the nodes following this node separated by single spaces
1 $node Node
contentBeforeAsTags
Return a string containing the tags of all the nodes preceding this node separated by single spaces
1 $node Node
position
Return the index of a node in its parent's content
1 $node Node
index
Return the index of a node in its parent index
1 $node Node
present
Return the count of the number of the specified tag types present immediately under a node
1 $node Node
2 @names Possible tags immediately under the node
count
Return the count of the number of instances of the specified tags under the specified node, either by tag in array context or in total in scalar context
1 $node Node
2 @names Possible tags immediately under the node
isText
Confirm that this is a text node
1 $node Node to test
isBlankText
Confirm that this is a text node and that it is blank
1 $node Node to test
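The difference between "position" and "index" is easiest to see when a tag repeats. A minimal sketch, assuming the module is installed; "first" and "last" are described under "Navigation".

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b/><c/><b/></a>');

my $tags     = $x->contentAsTags;               # tags of the children of <a>
my $beyond   = $x->first->contentBeyondAsTags;  # tags following the first <b>

my $lastB    = $x->last;                        # the second <b>
my $position = $lastB->position;                # counted among all children of <a>
my $index    = $lastB->index;                   # counted among the <b> children only

print "tags=$tags beyond=$beyond position=$position index=$index\n";
```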
Navigation
Move around in the parse tree
get
Return a sub node under the specified node as directed by the search specification: (index position?)* where index is the kind of tag to be chosen and position is the optional position within the index. Position defaults to zero if not specified. Position can also be negative to index back from the end of the index array.
1 $node Node
2 @position Search specification
Example:
$a->get(qw(a b -1))
would get the last b node under the first a node if such a node exists.
Use getX to execute get but die 'get' instead of returning undef
c
Return an array of all the nodes with the specified tag below the specified node
1 $node Node
2 $tag Tag
first
Return the first node below this node
1 $node Node
Use firstNonBlank to skip a (rare) initial blank text CDATA. Use firstNonBlankX to die rather than receive a returned undef result.
firstIn
Return the first node matching one of the named tags under the specified node
1 $node Node
2 @tags Tags to search for
Use firstInX to execute firstIn but die 'firstIn' instead of returning undef
firstContextOf
Return the first node encountered in the specified context in a depth first post-order traversal of the parse tree
1 $node Node
2 @context Array of tags specifying context
Use firstContextOfX to execute firstContextOf but die 'firstContextOf' instead of returning undef
last
Return the last node below this node
1 $node Node
Use lastNonBlank to skip a (rare) final blank text CDATA. Use lastNonBlankX to die rather than receive a returned undef result.
lastIn
Return the last node matching one of the named tags under the specified node
1 $node Node
2 @tags Tags to search for
Use lastInX to execute lastIn but die 'lastIn' instead of returning undef
lastContextOf
Return the last node encountered in the specified context in a depth first reverse pre-order traversal of the parse tree
1 $node Node
2 @context Array of tags specifying context
Use lastContextOfX to execute lastContextOf but die 'lastContextOf' instead of returning undef
next
Return the node next to the specified node
1 $node Node
Use nextNonBlank to skip a (rare) blank text CDATA. Use nextNonBlankX to die rather than receive a returned undef result.
nextIn
Return the next node matching one of the named tags
1 $node Node
2 @tags Tags to search for
Use nextInX to execute nextIn but die 'nextIn' instead of returning undef
prev
Return the node previous to the specified node
1 $node Node
Use prevNonBlank to skip a (rare) blank text CDATA. Use prevNonBlankX to die rather than receive a returned undef result.
prevIn
Return the previous node matching one of the named tags
1 $node Node
2 @tags Tags to search for
Use prevInX to execute prevIn but die 'prevIn' instead of returning undef
upto
Return the first ancestral node that matches the specified context
1 $node Start node
2 @tags Tags identifying context
Use uptoX to execute upto but die 'upto' instead of returning undef
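The sibling and child navigation methods combine naturally. The following is a sketch only, assuming the module is installed; the sample XML contains no white space so that no blank text nodes are involved.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b/><c/><d/><c/></a>');

my $first = $x->first;              # first child of <a>
my $last  = $x->last;               # last child of <a>
my $next  = $first->next;           # sibling after the first child
my $prev  = $last->prev;            # sibling before the last child
my @c     = $x->c('c');             # every <c> directly under <a>
my $in    = $x->firstIn(qw(c d));   # first child that is either a <c> or a <d>

print join(' ', map {$_->tag} $first, $last, $next, $prev, $in), "\n";
print scalar(@c), " c nodes\n";
```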
Position
at
Confirm that the node has the specified ancestry
1 $node Starting node
2 @context Ancestry
context
Return a string containing the tag of this node and its ancestors separated by single spaces
1 $node Node
isFirst
Confirm that this node is the first node under its parent
1 $node Node
isLast
Confirm that this node is the last node under its parent
1 $node Node
isOnlyChild
Confirm that this node is the only node under its parent
1 $node Node
isEmpty
Confirm that this node is empty, that is: this node has no content, not even a blank string of text
1 $node Node
over
Confirm that the string representing the tags at the level below this node match a regular expression
1 $node Node
2 $re Regular expression
after
Confirm that the string representing the tags following this node match a regular expression
1 $node Node
2 $re Regular expression
before
Confirm that the string representing the tags preceding this node match a regular expression
1 $node Node
2 $re Regular expression
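The position tests read a context from the node outwards, so the node's own tag comes first and the root last. A minimal sketch, assuming the module is installed; the variable names are illustrative.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b><c/></b><d/></a>');
my $b = $x->first;   # <b>
my $c = $b->first;   # <c>

my %check =
 (atFull      => $c->at(qw(c b a))     ? 1 : 0,  # c under b under a
  atWrong     => $c->at(qw(c a))       ? 1 : 0,  # the parent of c is b, not a
  onlyChild   => $c->isOnlyChild       ? 1 : 0,
  empty       => $c->isEmpty           ? 1 : 0,
  bIsFirst    => $b->isFirst           ? 1 : 0,
  bIsLast     => $b->isLast            ? 1 : 0,
  overMatches => $x->over(qr(\Ab d\z)) ? 1 : 0,  # children of a are exactly: b d
 );

print "context of c: ", $c->context, "\n";
print "$_=$check{$_}\n" for sort keys %check;
```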
Editing
Edit the data in the parse tree
Structure
Change the structure of the parse tree
change
Change the name of a node in an optional tag context and return the node
1 $node Node
2 $name New name
3 @tags Tags defining the context
Use changeX to execute change but die 'change' instead of returning undef
Wrap and unwrap
wrapWith
Wrap the original node in a new node forcing the original node down deepening the parse tree; return the new wrapping node
1 $old Node
2 $tag Tag for new node
3 %attributes Attributes for new node
wrapUp
Wrap the original node in a sequence of new nodes forcing the original node down deepening the parse tree; return the array of wrapping nodes
1 $node Node to wrap
2 @tags Tags to wrap the node with - with the uppermost tag rightmost
wrapDown
Wrap the content of the original node in a sequence of new nodes forcing the original node up deepening the parse tree; return the array of wrapping nodes
1 $node Node to wrap
2 @tags Tags to wrap the node with - with the uppermost tag rightmost
wrapContentWith
Wrap the content of a node in a new node: the original node then contains the new node, which in turn contains the original node's content; returns the new wrapping node
1 $old Node
2 $tag Tag for new node
3 %attributes Attributes for new node
unwrap
Unwrap a node by inserting its content into its parent at the point containing the node; returns the parent node
1 $node Node to unwrap
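Wrapping and unwrapping are inverse operations, which the following sketch demonstrates. It assumes the module is installed; the tags w and c and the id value are invented for the example.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x     = Data::Edit::Xml::new('<a><b/></a>');
my $inner = $x->first;

# Wrap <b> in a new <w id="w1">; the new wrapping node is returned
my $w       = $inner->wrapWith('w', id => 'w1');
my $wrapped = $x->string;

# Unwrap <w>, putting <b> back directly under <a>
$w->unwrap;
my $unwrapped = $x->string;

# Wrap everything below <a> in a new <c>
$x->wrapContentWith('c');
my $contentWrapped = $x->string;

print "$wrapped\n$unwrapped\n$contentWrapped\n";
```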
Replace
replaceWith
Replace a node (and all its content) with a new node (and all its content) and return the new node
1 $old Old node
2 $new New node
replaceWithText
Replace a node (and all its content) with a new text node and return the new node
1 $old Old node
2 $text Text of new node
replaceWithBlank
Replace a node (and all its content) with a new blank text node and return the new node
1 $old Old node
Cut and Put
Move nodes around in the parse tree
cut
Cut out a node - remove the node from the parse tree and return the node so that it can be put elsewhere
1 $node Node to cut out
putFirst
Place the new node at the front of the content of the original node and return the new node
1 $old Original node
2 $new New node
putLast
Place the new node at the end of the content of the original node and return the new node
1 $old Original node
2 $new New node
putNext
Place the new node just after the original node in the content of the parent and return the new node
1 $old Original node
2 $new New node
putPrev
Place the new node just before the original node in the content of the parent and return the new node
1 $old Original node
2 $new New node
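Moving a node is a cut followed by a put. A minimal sketch, assuming the module is installed; "newTag" is the constructor described under "Node by Node" and the tag d is illustrative.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b/><c/></a>');

# Move <b> from the front of <a> to the end
my $moved = $x->first->cut;
$x->putLast($moved);
my $afterMove = $x->string;

# Insert a brand new <d> just after the node that is now first
$x->first->putNext($x->newTag('d'));
my $afterInsert = $x->string;

print "$afterMove\n$afterInsert\n";
```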
Split a node
Split the content of a node by moving its preceding or following nodes to a new preceding or following node
concatenate
Concatenate two successive nodes and return the target node
1 $target Target node to replace
2 $source Node to concatenate
concatenateSiblings
Concatenate preceding and following nodes that have the same tag as the specified node and return the specified node
1 $node Concatenate around this node
splitBack
Move the specified node and all its preceding nodes to a newly created node preceding this node's parent and return the new node (mm July 31, 2017)
1 $old Move this node and its preceding nodes
2 $new The name of the new node
splitBackEx
Move all the nodes preceding a specified node to a newly created node preceding this node's parent and return the new node
1 $old Move all the nodes preceding this node
2 $new The name of the new node
splitForwards
Move the specified node and all its following nodes to a newly created node following this node's parent and return the new node
1 $old Move this node and its following nodes
2 $new The name of the new node
splitForwardsEx
Move all the nodes following a node to a newly created node following this node's parent and return the new node
1 $old Move the nodes following this node
2 $new The name of the new node
Put as text
Add text to the parse tree
putFirstAsText
Add a new text node first under a parent and return the new text node
1 $node The parent node
2 $text The string to be added which might contain unparsed Xml as well as text
putLastAsText
Add a new text node last under a parent and return the new text node
1 $node The parent node
2 $text The string to be added which might contain unparsed Xml as well as text
putNextAsText
Add a new text node following this node and return the new text node
1 $node The node after which the text is added
2 $text The string to be added which might contain unparsed Xml as well as text
putPrevAsText
Add a new text node preceding this node and return the new text node
1 $node The node before which the text is added
2 $text The string to be added which might contain unparsed Xml as well as text
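Text added with these methods stays a single text node even when it contains Xml; "renew" reparses the tree so that the new markup becomes real nodes. The sketch below assumes the module is installed and behaves as described above.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a/>');

# Add unparsed Xml as text: it is held as one text node under <a>
$x->putLastAsText('<b/>');
my $isTextBefore = $x->first->isText ? 1 : 0;

# Reparse: the text is now a proper <b> node in the renewed tree
my $renewed  = $x->renew;
my $tagAfter = $renewed->first->tag;

print "text node before renew: $isTextBefore, tag after renew: $tagAfter\n";
```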
Labels
Additional labels for a node which will be recognized by Data::Edit::Xml::Lint
addLabels
Add the named labels to the specified node and return that node
1 $node Node in parse tree
2 @labels Names of labels to add
countLabels
Return the count of the number of labels at a node
1 $node Node in parse tree
getLabels
Return the names of all the labels set on a node
1 $node Node in parse tree
deleteLabels
Delete the specified labels in the specified node and return that node
1 $node Node in parse tree
2 @labels Names of the labels to be deleted
deleteAllLabels
Delete all the labels in the specified node and return that node
1 $node Node in parse tree
copyLabels
Copy all the labels from the source node to the target node and return the source node
1 $source Source node
2 $target Target node
moveLabels
Move all the labels from the source node to the target node and return the source node
1 $source Source node
2 $target Target node
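The label methods can be exercised as follows. This is a sketch only, assuming the module is installed; the label names l1 and l2 are arbitrary.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b/><c/></a>');
my ($source, $target) = ($x->first, $x->last);

# Label <b>, then copy its labels to <c>
$source->addLabels(qw(l1 l2));
$source->copyLabels($target);

# Deleting a label from the source does not affect the copy on the target
$source->deleteLabels('l1');

my $sourceCount  = $source->countLabels;
my @targetLabels = sort $target->getLabels;

print "source has $sourceCount label(s), target has: @targetLabels\n";
```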
Operators
Operator access to methods: use the assign versions to avoid error messages about a pointless expression in void context. Use the non assign versions to return the results of the underlying method call. Thus '/' returns the wrapping node, whilst '/=' does not.
opString
-c : clone, -p : pretty string, -r : renew, -s : string, -t : tag.
1 $node Node
2 $op Monadic operator
Example:
-p $x
to print node $x as a pretty string
opContents
@{} : content of a node.
1 $node Node
Example:
grep {...} @$x
to search the contents of node $x
opOut
>>= : Write a parse tree out on a file.
1 $node Node
2 $file File
Example:
$x >>= *STDERR
opContext
<= : Check that a node is in the context specified by the referenced array of words.
1 $node Node
2 $context Reference to array of words specifying the parents of the desired node
Example:
$c <= [qw(c b a)]
to confirm that node $c has tag 'c', parent 'b' and grand parent 'a'
opPutFirst
+ or += : put a node or string first under a node.
1 $node Node
2 $text Node or text to place first under the node
Example:
my $f = $a + '<p>first</p>'
opPutLast
- : put a node or string last under a node.
1 $node Node
2 $text Node or text to place last under the node
Example:
my $l = $a - '<p>last</p>'
opPutNext
> : put a node or string after the current node.
1 $node Node
2 $text Node or text to place after the first node
Example:
my $n = $a > '<p>next</p>'
opPutPrev
< : put a node or string before the current node,
1 $node Node
2 $text Node or text to place before the first node
Example:
my $p = $a < '<p>prev</p>'
opBy
x= : Traverse a parse tree in post-order.
1 $node Parse tree
2 $code Code to execute against each node
Example:
$a x= sub {say -s $_}
to print all the parse trees in a parse tree
opGet
>> : Search for a node via a specification provided as a reference to an array of words and numbers. Each word represents a tag name, each number the index of the previous tag or zero by default.
1 $node Node
2 $get Reference to an array of search parameters
Example:
my $f = $a >> [qw(aa 1 bb)]
to find the first bb under the second aa under $a
opAttr
% : Get the value of an attribute of this node.
1 $node Node
2 $attr Name of the attribute whose value is wanted
Example:
my $a = $x % 'href'
to get the href attribute of the node at $x
opSetTag
+= : Set the tag for a node.
1 $node Node
2 $tag Tag
Example:
$a += 'tag'
to change the tag to 'tag' at the node $a
opSetId
-= : Set the id for a node.
1 $node Node
2 $id Id
Example:
$a -= 'id'
to change the id to 'id' at node $a
opWrapWith
/ or /= : Wrap node with a tag, returning or not returning the wrapping node.
1 $node Node
2 $tag Tag
Example:
$x /= 'aa'
to wrap node $x with a node with a tag of 'aa'
opWrapContentWith
* or *= : Wrap content with a tag, returning or not returning the wrapping node.
1 $node Node
2 $tag Tag
Example:
$x *= 'aa'
to wrap the content of node $x with a node with a tag of 'aa'
opCut
-- : Cut out a node.
1 $node Node
Example:
--$x
to cut out the node $x
opUnWrap
++ : Unwrap a node.
1 $node Node
Example:
++$x
to unwrap the node $x
Debug
Debugging methods
printAttributes
Print the attributes of a node
1 $node Node whose attributes are to be printed
printAttributesReplacingIdsWithLabels
Print the attributes of a node replacing the id with the labels
1 $node Node whose attributes are to be printed
checkParentage
Check the parent pointers are correct in a parse tree
1 $x Parse tree
This is a private method.
checkParser
Check that every node has a parser
1 $x Parse tree
This is a private method.
nn
Replace new line with N
1 $s String
This is a private method.
Index
printAttributesReplacingIdsWithLabels
stringReplacingIdWithLabelsQuoted
Installation
This module is written in 100% Pure Perl and, thus, it is easy to read, use, modify and install.
Standard Module::Build process for building and installing modules:
perl Build.PL
./Build
./Build test
./Build install
Author
Copyright
Copyright (c) 2016-2017 Philip R Brenan.
This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.