Name

Data::Edit::Xml - Edit data held in xml format

Synopsis

Transform some DocBook xml into Dita:

use Data::Edit::Xml;

# Docbook

say STDERR Data::Edit::Xml::new(<<END)
<sli>
 <li>
   <p>Diagnose the problem</p>
   <p>This can be quite difficult</p>
   <p>Sometimes impossible</p>
 </li>
 <li>
 <p><pre>ls -la</pre></p>
 <p><pre>
drwxr-xr-x  2 phil phil   4096 Jun 15  2016 Desktop
drwxr-xr-x  2 phil phil   4096 Nov  9 20:26 Downloads
</pre></p>
 </li>
</sli>
END

# Transform to Dita

->by(sub
 {my ($o, $p) = @_;
  if ($o->at(qw(pre p li sli)) and $o->isOnlyChild)
   {$o->change($p->isFirst ? qw(cmd) : qw(stepresult));
    $p->unwrap;
   }
  elsif ($o->at(qw(li sli))    and $o->over(qr(\Ap( p)+\Z)))
   {$_->change($_->isFirst ? qw(cmd) : qw(info)) for $o->contents;
   }
 })

 ->by(sub
 {my ($o) = @_;
  $o->change(qw(step))          if $o->at(qw(li sli));
  $o->change(qw(steps))         if $o->at(qw(sli));
  $o->id = 's'.($o->position+1) if $o->at(qw(step));
  $o->id = 'i'.($o->index+1)    if $o->at(qw(info));
  $o->wrapWith(qw(screen))      if $o->at(qw(CDATA stepresult));
 })

 # Print
 ->prettyString;

Produces:

<steps>
  <step id="s1">
    <cmd>Diagnose the problem</cmd>
    <info id="i1">This can be quite difficult</info>
    <info id="i2">Sometimes impossible</info>
  </step>
  <step id="s2">
    <cmd>ls -la</cmd>
    <stepresult>
      <screen>
drwxr-xr-x  2 phil phil   4096 Jun 15  2016 Desktop
drwxr-xr-x  2 phil phil   4096 Nov  9 20:26 Downloads
      </screen>
    </stepresult>
  </step>
</steps>

Description

Construction

Create a parse tree

File or String

Construct a parse tree from a file or a string

new

New parse - call this method statically as in Data::Edit::Xml::new(file or string) to parse the input immediately. Alternatively, call it with no parameters, use "input", "inputFile", "inputString" and "errorsFile" to provide specific parameters for the parse, then call "parse" to perform the parse and return the parse tree

1  $fileNameOrString  File name or string

attributes :lvalue

The attributes of this node, see also: "Attributes". The frequently used attributes: class, id, href, outputclass can be accessed by an lvalue method as in: $node->id = 'c1'

conditions :lvalue

Conditional strings attached to a node, see "Conditions"

content :lvalue

Content of command: the nodes immediately below this node in the order in which they appeared in the source text, see also "Contents"

indexes :lvalue

Indexes to sub commands by tag in the order in which they appeared in the source text

labels :lvalue

The labels attached to a node to provide addressability from other nodes, see: "Labels".

errorsFile :lvalue

Error listing file. Use this parameter to explicitly set the name of the file to which any parse errors will be written. By default this file is named: zzzParseErrors/out.data

inputFile :lvalue

Source file of the parse if this is the parser node. Use this parameter to explicitly set the file to be parsed.

input :lvalue

Source of the parse if this is the parser node. Use this parameter to specify some input either as a string or as a file name for the parser to convert into a parse tree

inputString :lvalue

Source string of the parse if this is the parser node. Use this parameter to explicitly set the string to be parsed.

parent :lvalue

Parent node of this node or undef if the root node. See also "Traversal" and "Navigation". Consider as read only.

parser :lvalue

Parser details: the root node of a tree is the parser node for that tree. Consider as read only.

tag :lvalue

Tag name for this node, see also "Traversal" and "Navigation". Consider as read only.

text :lvalue

Text of this node but only if it is a text node, i.e. the tag is cdata() <=> "isText" is true

cdata

The name of the tag to be used to represent text - this tag must not also be used as a command tag otherwise chaos will occur

parse

Parse input xml

1  $p  Parser created by "new"

Node by Node

Construct a parse tree node by node

newText

Create a new text node

1  undef  Any reference to this package
2  $text  Content of new text node

newTag

Create a new non text node

1  undef        Any reference to this package
2  $command     The tag for the node
3  %attributes  Attributes as a hash

newTree

Create a new tree - this is a static method

1  $command     The name of the root node in the tree
2  %attributes  Attributes of the root node in the tree as a hash

replaceSpecialChars

Replace < > " with &lt; &gt; &quot;. Larry Wall's excellent Xml parser unfortunately replaces &lt; &gt; &quot; &amp; etc. with their expansions in text by default and does not seem to provide an obvious way to stop this behavior, so we have to put them back again using this method. Worse, we cannot decide whether to replace & with &amp; or leave it as is: consequently you might have to examine the instances of & in your output text and guess based on the context.

1  $string  String to be edited
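
As a rough illustration of the substitution described above, the following sketch escapes < > and " while leaving & untouched. The sub name escapeSpecialChars is hypothetical and this is not the module's own code, only a minimal stand-in written in core Perl.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the behaviour described above, not the module's code.
# Only < > " are escaped; & is left as is because it cannot be decided reliably.
sub escapeSpecialChars {
    my ($string) = @_;
    my %entity = ('<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $string =~ s/([<>"])/$entity{$1}/g;
    return $string;
}

print escapeSpecialChars('if a < b & b > c say "ok"'), "\n";
# prints: if a &lt; b & b &gt; c say &quot;ok&quot;
```

The ampersand survives unchanged, which is why the output may still need to be inspected by hand.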

tags

Count the number of tags in a parse tree

1  $node  Parse tree

renew

Returns a renewed copy of the parse tree: use this method if you have added nodes via the "Put as text" methods and wish to reprocess them

1  $node  Parse tree

clone

Return a clone of the parse tree: use this method if you want to make temporary changes to a parse tree

1  $node  Parse tree

equals

Decide whether two parse trees are equal or not

1  $node1  Parse tree 1
2  $node2  Parse tree 2

save

Save a copy of the parse tree to a file which can be restored and return the saved node

1  $node  Parse tree
2  $file  File

restore

Return a parse tree from a copy saved in a file by "save" - this is a static method so call it as Data::Edit::Xml::restore(file name)

1  $file  File

Stringification

Create a string representation of the parse tree with optional selection of nodes via conditions

Print

Print the parse tree

string

Return a string representing a node of a parse tree and all the nodes below it

1  $node  Start node

contentString

Return a string representing all the nodes below a node of a parse tree

1  $node  Start node

prettyString

Return a readable string representing a node of a parse tree and all the nodes below it

1  $node   Start node
2  $depth  Depth

PrettyContentString

Return a readable string representing all the nodes below a node of a parse tree - infrequent use and so capitalized to avoid being presented as an option by Geany

1  $node  Start node

Conditions

Print a subset of the parse tree determined by the conditions attached to it

stringWithConditions

Return a string representing a node of a parse tree and all the nodes below it subject to conditions to select or reject some nodes

1  $node        Start node
2  @conditions  Conditions to be regarded as in effect

addConditions

Add conditions to a node and return the node

1  $node        Node
2  @conditions  Conditions to add

deleteConditions

Delete conditions applied to a node and return the node

1  $node        Node
2  @conditions  Conditions to delete

listConditions

Return a list of conditions applied to a node

1  $node  Node

Attributes

Get or set attributes

attr :lvalue

Return the value of an attribute of the current node as an assignable value

1  $node       Node in parse tree
2  $attribute  Attribute name

attrs

Return the values of the specified attributes of the current node

1  $node        Node in parse tree
2  @attributes  Attribute names

attrCount

Return the number of attributes in the specified node

1  $node  Node in parse tree

setAttr

Set the value of an attribute in a node and return the node

1  $node    Node in parse tree
2  %values  (attribute name=>new value)*

deleteAttr

Delete the attribute, optionally checking its value first and return the node

1  $node   Node
2  $attr   Attribute name
3  $value  Optional attribute value to check first

deleteAttrs

Delete any attributes mentioned in a list without checking their values and return the node

1  $node   Node
2  @attrs  Attribute names

renameAttr

Change the name of an attribute regardless of whether the new attribute already exists and return the node

1  $node  Node
2  $old   Existing attribute name
3  $new   New attribute name

changeAttr

Change the name of an attribute unless it has already been set and return the node

1  $node  Node
2  $old   Existing attribute name
3  $new   New attribute name
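
The difference between "renameAttr" and "changeAttr" can be sketched on a plain hash of attributes. The helpers renameKey and changeKey are hypothetical and are not the module's code; in particular, leaving the old attribute in place when the new one already exists is an assumption made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helpers working on a plain hash of attributes, not the module's code.

# Rename regardless of whether $new already exists: any existing value is overwritten.
sub renameKey {
    my ($attributes, $old, $new) = @_;
    $attributes->{$new} = delete $attributes->{$old} if exists $attributes->{$old};
    return $attributes;
}

# Rename only if $new has not already been set, otherwise change nothing (assumed).
sub changeKey {
    my ($attributes, $old, $new) = @_;
    return $attributes if exists $attributes->{$new};
    return renameKey($attributes, $old, $new);
}

my $renamed = renameKey({outputclass => 'a', class => 'b'}, qw(outputclass class));
my $changed = changeKey({outputclass => 'a', class => 'b'}, qw(outputclass class));

print "renamed: class=$renamed->{class}\n";   # renamed: class=a
print "changed: class=$changed->{class}\n";   # changed: class=b
```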

renameAttrValue

Change the name and value of an attribute regardless of whether the new attribute already exists and return the node

1  $node      Node
2  $old       Existing attribute name
3  $oldValue  Existing attribute value
4  $new       New attribute name
5  $newValue  New attribute value

changeAttrValue

Change the name and value of an attribute unless it has already been set and return the node

1  $node      Node
2  $old       Existing attribute name
3  $oldValue  Existing attribute value
4  $new       New attribute name
5  $newValue  New attribute value

Traversal

Traverse the parse tree

by

Post-order traversal of a parse tree or sub tree and return the specified starting node

1  $node     Starting node
2  $sub      Sub to call for each sub node
3  @context  Accumulated context

byReverse

Reverse post-order traversal of a parse tree or sub tree and return the specified starting node

1  $node     Starting node
2  $sub      Sub to call for each sub node
3  @context  Accumulated context

down

Pre-order traversal down through a parse tree or sub tree and return the specified starting node

1  $node     Starting node
2  $sub      Sub to call for each sub node
3  @context  Accumulated context

downReverse

Reverse pre-order traversal down through a parse tree or sub tree and return the specified starting node

1  $node     Starting node
2  $sub      Sub to call for each sub node
3  @context  Accumulated context

through

Traverse parse tree visiting each node twice and return the specified starting node

1  $node     Starting node
2  $before   Sub to call when we meet a node
3  $after    Sub to call when we leave a node
4  @context  Accumulated context
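
The traversal orders described in this section can be compared on a miniature tree. This is a sketch in core Perl only: the hash based nodes and the subs postOrder, preOrder and throughOrder are hypothetical stand-ins for "by", "down" and "through", not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical miniature tree: each node is a hash with a tag and a list of children.
my $tree = {tag => 'a', content => [
    {tag => 'b', content => [{tag => 'c', content => []}]},
    {tag => 'd', content => []},
]};

# Post-order: children first, then the node itself (the order documented for "by").
sub postOrder {
    my ($node, $sub) = @_;
    postOrder($_, $sub) for @{$node->{content}};
    $sub->($node);
    return $node;
}

# Pre-order: the node itself first, then its children (the order documented for "down").
sub preOrder {
    my ($node, $sub) = @_;
    $sub->($node);
    preOrder($_, $sub) for @{$node->{content}};
    return $node;
}

# Visit each node twice: once on the way in and once on the way out (as for "through").
sub throughOrder {
    my ($node, $before, $after) = @_;
    $before->($node);
    throughOrder($_, $before, $after) for @{$node->{content}};
    $after->($node);
    return $node;
}

my (@post, @pre, @both);
postOrder   ($tree, sub {push @post, $_[0]{tag}});
preOrder    ($tree, sub {push @pre,  $_[0]{tag}});
throughOrder($tree, sub {push @both, "<$_[0]{tag}>"}, sub {push @both, "</$_[0]{tag}>"});

print "post-order: @post\n";   # post-order: c b d a
print "pre-order:  @pre\n";    # pre-order:  a b c d
print "through:    @both\n";   # through:    <a> <b> <c> </c> </b> <d> </d> </a>
```

Post-order suits edits such as the synopsis, where a node is changed only after everything below it has been processed.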

Contents

Contents of the specified node

contents

Return all the nodes contained by this node either as an array or as a reference to such an array

1  $node  Node

contentBeyond

Return all the nodes following this node at the level of this node

1  $node  Node

contentBefore

Return all the nodes preceding this node at the level of this node

1  $node  Node

contentAsTags

Return a string containing the tags of all the nodes contained by this node separated by single spaces

1  $node  Node

contentBeyondAsTags

Return a string containing the tags of all the nodes following this node separated by single spaces

1  $node  Node

position

Return the index of a node in its parent's content

1  $node  Node

index

Return the index of a node in its parent index

1  $node  Node

present

Return the count of the number of the specified tag types present immediately under a node

1  $node   Node
2  @names  Possible tags immediately under the node

count

Return the count of the number of instances of the specified tags under the specified node, either by tag in array context or in total in scalar context

1  $node   Node
2  @names  Possible tags immediately under the node

isText

Confirm that this is a text node

1  $node  Node to test

isBlankText

Confirm that this is a text node and that it is blank

1  $node  Node to test

Navigation

Move around in the parse tree

get

Return a sub node under the specified node by its position in each index with position zero assumed if no position is supplied

1  $node      Node
2  @position  Position specification: (index position?)* where position defaults to zero if not specified
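
The position specification (index position?)* can be read as a list of tags, each optionally followed by a number. The following sketch interprets such a list against a miniature tree built from hashes; the subs node and getByPosition are hypothetical and only illustrate the idea, they are not the module's code.

```perl
use strict;
use warnings;

# Hypothetical miniature nodes: a tag, an id for display, and a list of child nodes.
sub node {
    my ($tag, $id, @content) = @_;
    return {tag => $tag, id => $id, content => [@content]};
}

my $root = node('a', 'a1',
    node('aa', 'aa1', node('bb', 'bb1')),
    node('aa', 'aa2', node('cc', 'cc1'), node('bb', 'bb2')),
);

# Walk down the tree: each tag selects the children with that tag and the
# optional number that follows selects one of them, defaulting to zero.
sub getByPosition {
    my ($node, @position) = @_;
    while (@position) {
        my $tag     = shift @position;
        my $index   = (@position and $position[0] =~ /\A\d+\z/) ? shift @position : 0;
        my @sameTag = grep {$_->{tag} eq $tag} @{$node->{content}};
        $node = $sameTag[$index] or return undef;
    }
    return $node;
}

my $found = getByPosition($root, qw(aa 1 bb));
print $found->{id}, "\n";   # bb2
```

Here qw(aa 1 bb) selects the first bb under the second aa, because positions count from zero.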

c

Return an array of all the nodes with the specified tag below the specified node

1  $node  Node
2  $tag   Tag

first

Return the first node below this node

1  $node  Node

firstChild

Return the first instance of each of the specified tags under the specified node

1  $node  Node
2  @tags  Tags to find the first instance of

firstContextOf

Return the first node encountered in the specified context in a depth first post-order traversal of the parse tree

1  $node     Node
2  @context  Array of tags specifying context

last

Return the last node below this node

1  $node  Node

lastContextOf

Return the last node encountered in the specified context in a depth first reverse pre-order traversal of the parse tree

1  $node     Node
2  @context  Array of tags specifying context

next

Return the node next to the specified node

1  $node  Node

nextNonBlank

Return the next node skipping any intervening blank text node

1  $node  Node

prev

Return the node previous to the specified node

1  $node  Node

prevNotBlank

Return the previous node skipping any intervening blank text node

1  $node  Node

upto

Return the first ancestral node that matches the specified context

1  $node  Start node
2  @tags  Tags identifying context

Position

at

Confirm that the node has the specified ancestry

1  $node     Starting node
2  @context  Ancestry
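
A context is listed from the node outwards: the node's own tag first, then its parent, then its grandparent, as in $o->at(qw(pre p li sli)) in the synopsis. The sketch below shows the idea on hashes with parent pointers; hasAncestry is a hypothetical helper, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical nodes: each hash holds a tag and a reference to its parent.
my $sli = {tag => 'sli', parent => undef};
my $li  = {tag => 'li',  parent => $sli};
my $p   = {tag => 'p',   parent => $li};

# Confirm that the tags in @context match the node, its parent, its grandparent ...
sub hasAncestry {
    my ($node, @context) = @_;
    for my $tag (@context) {
        return 0 unless $node and $node->{tag} eq $tag;
        $node = $node->{parent};
    }
    return 1;
}

for my $context ([qw(p li sli)], [qw(p li)], [qw(p sli)]) {
    my $result = hasAncestry($p, @$context) ? 'yes' : 'no';
    print "@$context: $result\n";
}
# p li sli: yes
# p li: yes
# p sli: no
```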

context

Return a string containing the tag of this node and its ancestors separated by single spaces

1  $node  Node

isFirst

Confirm that this node is the first node under its parent

1  $node  Node

isLast

Confirm that this node is the last node under its parent

1  $node  Node

isOnlyChild

Confirm that this node is the only node under its parent

1  $node  Node

isEmpty

Confirm that this node is empty, that is: this node has no content, not even a blank string of text

1  $node  Node

over

Confirm that the string representing the tags at the level below this node matches a regular expression

1  $node  Node
2  $re    Regular expression
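
The test can be pictured as joining the tags of the child nodes with single spaces and matching the result, which is how qr(\Ap( p)+\Z) in the synopsis detects two or more consecutive p nodes. The helper tagsMatch below is hypothetical and works on a plain list of tags rather than on real nodes.

```perl
use strict;
use warnings;

# Hypothetical helper: join the child tags with single spaces, then match.
sub tagsMatch {
    my ($re, @tags) = @_;
    my $string = join ' ', @tags;
    return $string =~ $re ? 1 : 0;
}

my $paragraphs = qr(\Ap( p)+\Z);   # the expression used in the synopsis

for my $tags ([qw(p p p)], [qw(p pre p)], [qw(p)]) {
    my $result = tagsMatch($paragraphs, @$tags) ? 'matches' : 'does not match';
    print "@$tags: $result\n";
}
# p p p: matches
# p pre p: does not match
# p: does not match
```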

after

Confirm that the string representing the tags following this node matches a regular expression

1  $node  Node
2  $re    Regular expression

before

Confirm that the string representing the tags preceding this node matches a regular expression

1  $node  Node
2  $re    Regular expression

Editing

Edit the data in the parse tree

Structure

Change the structure of the parse tree

change

Change the name of a node in an optional tag context and return the node

1  $node  Node
2  $name  New name
3  @tags  Tags defining the context

Wrap and unwrap

wrapWith

Wrap the original node in a new node forcing the original node down deepening the parse tree; return the new wrapping node

1  $old  Node
2  $tag  Tag for new node

wrapUp

Wrap the original node in a sequence of new nodes forcing the original node down deepening the parse tree; return the array of wrapping nodes

1  $node  Node to wrap
2  @tags  Tags to wrap the node with - with the uppermost tag rightmost

wrapDown

Wrap the content of the original node in a sequence of new nodes forcing the original node up deepening the parse tree; return the array of wrapping nodes

1  $node  Node to wrap
2  @tags  Tags to wrap the node with - with the uppermost tag rightmost

wrapContentWith

Wrap the content of a node in a new node: the original node then contains the new node, which in turn contains the original node's content; returns the new wrapping node

1  $old  Node
2  $tag  Tag for new node

unwrap

Unwrap a node by inserting its content into its parent at the point containing the node; returns the parent node

1  $node  Node to unwrap
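
Unwrapping amounts to splicing a node's content into its parent's content at the position the node occupied, as $p->unwrap does in the synopsis. The following sketch shows this on hash based nodes; unwrapNode is a hypothetical helper that takes the parent explicitly and is not the module's code.

```perl
use strict;
use warnings;

# Hypothetical miniature nodes: a tag plus a list of child nodes.
my $pre  = {tag => 'pre',  content => []};
my $p    = {tag => 'p',    content => [$pre]};
my $note = {tag => 'note', content => []};
my $li   = {tag => 'li',   content => [$p, $note]};

# Replace $node in the content of $parent with the content of $node.
sub unwrapNode {
    my ($parent, $node) = @_;
    my $content = $parent->{content};
    my ($i) = grep {$content->[$_] == $node} 0 .. $#$content;
    return $parent unless defined $i;           # node is not a child of this parent
    splice @$content, $i, 1, @{$node->{content}};
    return $parent;
}

unwrapNode($li, $p);
print join(' ', map {$_->{tag}} @{$li->{content}}), "\n";   # pre note
```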

Replace

replaceWith

Replace a node (and all its content) with a new node (and all its content) and return the new node

1  $old  Old node
2  $new  New node

replaceWithText

Replace a node (and all its content) with a new text node and return the new node

1  $old   Old node
2  $text  Text of new node

replaceWithBlank

Replace a node (and all its content) with a new blank text node and return the new node

1  $old  Old node

Cut and Put

Move nodes around in the parse tree

cut

Cut out a node - remove the node from the parse tree and return the node so that it can be put elsewhere

1  $node  Node to cut out

putFirst

Place the new node at the front of the content of the original node and return the new node

1  $old  Original node
2  $new  New node

putLast

Place the new node at the end of the content of the original node and return the new node

1  $old  Original node
2  $new  New node

putNext

Place the new node just after the original node in the content of the parent and return the new node

1  $old  Original node
2  $new  New node

putPrev

Place the new node just before the original node in the content of the parent and return the new node

1  $old  Original node
2  $new  New node

Split a node

Split the content of a node by moving its preceding or following nodes to a new node placed before or after the node's parent

concatenate

Concatenate two successive nodes and return the target node

1  $target  Target node to replace
2  $source  Node to concatenate

splitBack

Move the specified node and all its preceding nodes to a newly created node preceding this node's parent and return the new node (mm July 31, 2017)

1  $old  Move this node and its preceding nodes
2  $new  The name of the new node

splitBackEx

Move all the nodes preceding a specified node to a newly created node preceding this node's parent and return the new node

1  $old  Move all the nodes preceding this node
2  $new  The name of the new node

splitForwards

Move the specified node and all its following nodes to a newly created node following this node's parent and return the new node

1  $old  Move this node and its following nodes
2  $new  The name of the new node

splitForwardsEx

Move all the nodes following a node to a newly created node following this node's parent and return the new node

1  $old  Move the nodes following this node
2  $new  The name of the new node

Put as text

Add text to the parse tree

putFirstAsText

Add a new text node first under a parent and return the new text node

1  $node  The parent node
2  $text  The string to be added which might contain unparsed Xml as well as text

putLastAsText

Add a new text node last under a parent and return the new text node

1  $node  The parent node
2  $text  The string to be added which might contain unparsed Xml as well as text

putNextAsText

Add a new text node following this node and return the new text node

1  $node  The node after which the new text node will be placed
2  $text  The string to be added which might contain unparsed Xml as well as text

putPrevAsText

Add a new text node preceding this node and return the new text node

1  $node  The node before which the new text node will be placed
2  $text  The string to be added which might contain unparsed Xml as well as text

Labels

Additional labels for a node which will be recognized by Data::Edit::Xml::Lint

addLabels

Add the named labels to the specified node and return that node

1  $node    Node in parse tree
2  @labels  Names of labels to add

countLabels

Return the count of the number of labels at a node

1  $node  Node in parse tree

getLabels

Return the names of all the labels set on a node

1  $node  Node in parse tree

deleteLabels

Delete the specified labels in the specified node and return that node

1  $node    Node in parse tree
2  @labels  Names of the labels to be deleted

deleteAllLabels

Delete all the labels in the specified node and return that node

1  $node  Node in parse tree

copyLabels

Copy all the labels from the source node to the target node and return the source node

1  $source  Source node
2  $target  Target node

moveLabels

Move all the labels from the source node to the target node and return the source node

1  $source  Source node
2  $target  Target node

Operators

Operator access to methods. Use the assign versions to avoid error messages about pointless expressions in a void context. Use the non assign versions to return the results of the underlying method call. Thus '/' returns the wrapping node, whilst '/=' does not.

opString

-c : clone, -p : pretty string, -r : renew, -s : string, -t : tag.

1  $node  Node
2  $op    Monadic operator

Example:

-p $x

to print node $x as a pretty string

opContents

@{} : content of a node.

1  $node  Node

Example:

grep {...} @$x

to search the contents of node $x

opOut

>>= : Write a parse tree out on a file.

1  $node  Node
2  $file  File

Example:

$x >>= *STDERR

opContext

<= : Check that a node is in the context specified by the referenced array of words.

1  $node     Node
2  $context  Reference to array of words specifying the parents of the desired node

Example:

$c <= [qw(c b a)]

to confirm that node $c has tag 'c', parent 'b' and grand parent 'a'

opPutFirst

+ or += : put a node or string first under a node.

1  $node  Node
2  $text  Node or text to place first under the node

Example:

my $f = $a + '<p>first</p>'

opPutLast

- : put a node or string last under a node.

1  $node  Node
2  $text  Node or text to place last under the node

Example:

my $l = $a - '<p>last</p>'

opPutNext

> : put a node or string after the current node.

1  $node  Node
2  $text  Node or text to place after the first node

Example:

my $n = $a > '<p>next</p>'

opPutPrev

< : put a node or string before the current node.

1  $node  Node
2  $text  Node or text to place before the first node

Example:

my $p = $a < '<p>prev</p>'

opBy

x= : Traverse a parse tree in post-order.

1  $node  Parse tree
2  $code  Code to execute against each node

Example:

$a x= sub {say -s $_}

to print every sub tree in a parse tree

opGet

>> : Search for a node via a specification provided as a reference to an array of words and numbers. Each word represents a tag name, each number the index of the previous tag or zero by default.

1  $node  Node
2  $get   Reference to an array of search parameters

Example:

my $f = $a >> [qw(aa 1 bb)]

to find the first bb under the second aa under $a

opAttr

% : Get the value of an attribute of this node.

1  $node  Node
2  $attr  Name of the attribute whose value is required

Example:

my $a = $x % 'href'

to get the href attribute of the node at $x

opSetTag

+= : Set the tag for a node.

1  $node  Node
2  $tag   Tag

Example:

$a += 'tag'

to change the tag to 'tag' at the node $a

opSetId

-= : Set the id for a node.

1  $node  Node
2  $id    Id

Example:

$a -= 'id'

to change the id to 'id' at node $a

opWrapWith

/ or /= : Wrap node with a tag, returning or not returning the wrapping node.

1  $node  Node
2  $tag   Tag

Example:

$x /= 'aa'

to wrap node $x with a node with a tag of 'aa'

opWrapContentWith

* or *= : Wrap content with a tag, returning or not returning the wrapping node.

1  $node  Node
2  $tag   Tag

Example:

$x *= 'aa'

to wrap the content of node $x with a node with a tag of 'aa'

opCut

-- : Cut out a node.

1  $node  Node

Example:

--$x

to cut out the node $x

opUnWrap

++ : Unwrap a node.

1  $node  Node

Example:

++$x

to unwrap the node $x

Debug

Debugging methods

printAttributes

Print the attributes of a node

1  $node  Node whose attributes are to be printed

checkParentage

Check the parent pointers are correct in a parse tree

1  $x  Parse tree

checkParser

Check that every node has a parser

1  $x  Parse tree

Index

addConditions

addLabels

after

at

attr :lvalue

attrCount

attributes

attrs

before

by

byReverse

c

cdata

change

changeAttr

changeAttrValue

checkParentage

checkParser

clone

concatenate

conditions

content

contentAsTags

contentBefore

contentBeyond

contentBeyondAsTags

contents

contentString

context

copyLabels

count

countLabels

cut

deleteAllLabels

deleteAttr

deleteAttrs

deleteConditions

deleteLabels

down

downReverse

equals

errorsFile

first

firstChild

firstContextOf

get

getLabels

index

indexes

input

inputFile

inputString

isBlankText

isEmpty

isFirst

isLast

isOnlyChild

isText

labels

last

lastContextOf

listConditions

moveLabels

new

newTag

newText

newTree

next

nextNonBlank

opAttr

opBy

opContents

opContext

opCut

opGet

opOut

opPutFirst

opPutLast

opPutNext

opPutPrev

opSetId

opSetTag

opString

opUnWrap

opWrapContentWith

opWrapWith

over

parent

parse

parser

position

present

PrettyContentString

prettyString

prev

prevNotBlank

printAttributes

putFirst

putFirstAsText

putLast

putLastAsText

putNext

putNextAsText

putPrev

putPrevAsText

renameAttr

renameAttrValue

renew

replaceSpecialChars

replaceWith

replaceWithBlank

replaceWithText

restore

save

setAttr

splitBack

splitBackEx

splitForwards

splitForwardsEx

string

stringWithConditions

tag

tags

text

through

unwrap

upto

wrapContentWith

wrapDown

wrapUp

wrapWith

Installation

This module is written in 100% Pure Perl and is thus easy to read, use, modify and install.

Standard Module::Build process for building and installing modules:

perl Build.PL
./Build
./Build test
./Build install

Author

philiprbrenan@gmail.com

http://www.appaapps.com

Copyright

Copyright (c) 2016-2017 Philip R Brenan.

This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.