Name

Data::Edit::Xml - Edit data held in xml format

Synopsis

Transform some DocBook xml into Dita:

use feature 'say';
use Data::Edit::Xml;

# Docbook

say STDERR Data::Edit::Xml::new(<<END)
<sli>
 <li>
   <p>Diagnose the problem</p>
   <p>This can be quite difficult</p>
   <p>Sometimes impossible</p>
 </li>
 <li>
 <p><pre>ls -la</pre></p>
 <p><pre>
drwxr-xr-x  2 phil phil   4096 Jun 15  2016 Desktop
drwxr-xr-x  2 phil phil   4096 Nov  9 20:26 Downloads
</pre></p>
 </li>
</sli>
END

# Transform to Dita

->by(sub
 {my ($o, $p) = @_;
  if ($o->at(qw(pre p li sli)) and $o->isOnlyChild)
   {$o->change($p->isFirst ? qw(cmd) : qw(stepresult));
    $p->unwrap;
   }
  elsif ($o->at(qw(li sli))    and $o->over(qr(\Ap( p)+\Z)))
   {$_->change($_->isFirst ? qw(cmd) : qw(info)) for $o->contents;
   }
 })

 ->by(sub
 {my ($o) = @_;
  $o->change(qw(step))          if $o->at(qw(li sli));
  $o->change(qw(steps))         if $o->at(qw(sli));
  $o->id = 's'.($o->position+1) if $o->at(qw(step));
  $o->id = 'i'.($o->index+1)    if $o->at(qw(info));
  $o->wrapWith(qw(screen))      if $o->at(qw(CDATA stepresult));
 })

 # Print
 ->prettyString;

Produces:

<steps>
  <step id="s1">
    <cmd>Diagnose the problem</cmd>
    <info id="i1">This can be quite difficult</info>
    <info id="i2">Sometimes impossible</info>
  </step>
  <step id="s2">
    <cmd>ls -la</cmd>
    <stepresult>
      <screen>
drwxr-xr-x  2 phil phil   4096 Jun 15  2016 Desktop
drwxr-xr-x  2 phil phil   4096 Nov  9 20:26 Downloads
      </screen>
    </stepresult>
  </step>
</steps>

Description

Constructor

new

New parse - call this method statically as in Data::Edit::Xml::new(file or string) to parse immediately, or call it with no parameters, use "input", "inputFile", "inputString", "errorsFile" to provide specific parameters for the parse, then call "parse" to perform the parse and return the parse tree.

   Parameter          Description
1  $fileNameOrString  File name or string

parent :lvalue

Parent node of this node or undef if root node, see also "Traversal" and "Navigation". Consider as read only.

parser :lvalue

Parser details: the root node of a tree is the parser node for that tree. Consider as read only.

tag :lvalue

Tag name for this node, see also "Traversal" and "Navigation". Consider as read only.

input :lvalue

Source of the parse if this is the parser node. Use this parameter to specify some input either as a string or as a file name for the parser to convert into a parse tree

inputFile :lvalue

Source file of the parse if this is the parser node. Use this parameter to explicitly set the file to be parsed.

inputString :lvalue

Source string of the parse if this is the parser node. Use this parameter to explicitly set the string to be parsed.

errorsFile :lvalue

Error listing file. Use this parameter to explicitly set the name of the file that parse errors will be written to. By default this file is named: zzzParseErrors/out.data

text :lvalue

Text of this node, but only if it is a text node: that is, the tag is cdata(), which holds exactly when "isText" is true

content :lvalue

Content of command: the nodes immediately below this node in the order in which they appeared in the source text, see also "Contents"

attributes :lvalue

The attributes of this node, see also: "Attributes". The frequently used attributes: class, id, href, outputclass can be accessed by an lvalue method as in: $node->id = 'c1'

labels :lvalue

The labels attached to a node to provide addressability from other nodes, see: "Labels".

conditions :lvalue

Conditional strings attached to a node, see "Conditions"

indexes :lvalue

Indexes to sub commands by tag in the order in which they appeared in the source text

cdata

The name of the tag to be used to represent text - this tag must not also be used as a command tag otherwise chaos will occur

parse

Parse input xml

   Parameter  Description
1  $p         Parser created by "new"

newText

Create a new text node

   Parameter  Description
1  undef      Any reference to this package
2  $text      Content of new text node

newTag

Create a new non text node

   Parameter    Description
1  undef        Any reference to this package
2  $command     The tag for the node
3  %attributes  Attributes as a hash

newTree

Create a new tree - this is a static method

   Parameter    Description
1  $command     The name of the root node in the tree
2  %attributes  Attributes of the root node in the tree as a hash

replaceSpecialChars

Replace < > " with &lt; &gt; &quot;. Larry Wall's excellent Xml parser unfortunately replaces &lt; &gt; &quot; &amp; etc. with their expansions in text by default and does not seem to provide an obvious way to stop this behavior, so we have to put them back again using this method. Worse, we cannot decide whether to replace & with &amp; or leave it as is: consequently you might have to examine the instances of & in your output text and guess based on the context.

   Parameter  Description
1  $_[0]      String in which to replace the special characters that we can replace, using: $_[0] =~ s/\</&lt;/gr =~ s/\>/&gt;/gr =~ s/\"/&quot;/gr
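
As a stand-alone illustration of the substitution chain shown above, the sketch below applies the same three replacements to a sample string. The helper name escapeSpecialChars is hypothetical and is not part of this module. The chained s///r form needs Perl 5.14 or later.

```perl
use 5.014;
use strict;
use warnings;

# Hypothetical helper mirroring the substitution chain documented above
sub escapeSpecialChars
 {my ($s) = @_;
  $s =~ s/</&lt;/gr =~ s/>/&gt;/gr =~ s/"/&quot;/gr;   # & is deliberately left alone
 }

my $escaped = escapeSpecialChars('if (a < b && c > d) say "ok"');
print "$escaped\n";   # if (a &lt; b && c &gt; d) say &quot;ok&quot;
```

Note that the ampersands pass through unchanged, which is the ambiguity the description warns about.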

tags

Count the number of tags in a parse tree

   Parameter  Description
1  $node      Parse tree

renew

Returns a renewed copy of the parse tree: use this method if you have added nodes via the "Put as text" methods and wish to reprocess them

   Parameter  Description
1  $node      Parse tree

clone

Return a clone of the parse tree: use this method if you want to make temporary changes to a parse tree

   Parameter  Description
1  $node      Parse tree

equals

Decide whether two parse trees are equal or not

   Parameter  Description
1  $node1     Parse tree 1
2  $node2     Parse tree 2

save

Save a copy of the parse tree to a file which can be restored and return the saved node

   Parameter  Description
1  $node      Parse tree
2  $file      File

restore

Return a parse tree from a copy saved in a file by "save" - this is a static method so call it as Data::Edit::Xml::restore(file name)

   Parameter  Description
1  $file      File

Stringification

Print the parse tree

string

Return a string representing a node of a parse tree and all the nodes below it

   Parameter  Description
1  $node      Start node

contentString

Return a string representing all the nodes below a node of a parse tree

   Parameter  Description
1  $node      Start node

prettyString

Return a readable string representing a node of a parse tree and all the nodes below it

   Parameter  Description
1  $node      Start node
2  $depth     Depth

PrettyContentString

Return a readable string representing all the nodes below a node of a parse tree - infrequently used and so capitalised to avoid being presented as an option by Geany

   Parameter  Description
1  $node      Start node

Conditions

Print a subset of the parse tree determined by the conditions attached to it

stringWithConditions

Return a string representing a node of a parse tree and all the nodes below it subject to conditions to select or reject some nodes

   Parameter    Description
1  $node        Start node
2  @conditions  Conditions to be regarded as in effect

addConditions

Add conditions to a node and return the node

   Parameter    Description
1  $node        Node
2  @conditions  Conditions to add

deleteConditions

Delete conditions applied to a node and return the node

   Parameter    Description
1  $node        Node
2  @conditions  Conditions to delete

listConditions

Return a list of conditions applied to a node

   Parameter  Description
1  $node      Node

Attributes

Get or set attributes

attr :lvalue

Return the value of an attribute of the current node as an assignable value

   Parameter   Description
1  $node       Node in parse tree
2  $attribute  Attribute name

attrs

Return the values of the specified attributes of the current node

   Parameter    Description
1  $node        Node in parse tree
2  @attributes  Attribute names

attrCount

Return the number of attributes in the specified node

   Parameter  Description
1  $node      Node in parse tree

setAttr

Set the value of an attribute in a node and return the node

   Parameter  Description
1  $node      Node in parse tree
2  %values    (attribute name=>new value)*
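
A short sketch of setting and reading attributes, using only the methods documented in this file (the id lvalue method, attr, setAttr, attrCount). The attribute names and values are made up for illustration.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a/>');

$x->id = 'c1';                   # frequently used attribute via its lvalue method
$x->attr('lang') = 'en';         # any attribute via attr, which is assignable
$x->setAttr(class => 'note');    # setAttr takes name => value pairs and returns the node

print $x->attrCount,     "\n";   # number of attributes now on the node
print $x->attr('class'), "\n";
```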

deleteAttr

Delete the attribute, optionally checking its value first and return the node

   Parameter  Description
1  $node      Node
2  $attr      Attribute name
3  $value     Optional attribute value to check first

deleteAttrs

Delete any attributes mentioned in a list without checking their values and return the node

   Parameter  Description
1  $node      Node
2  @attrs     Attribute names

renameAttr

Change the name of an attribute regardless of whether the new attribute already exists and return the node

   Parameter  Description
1  $node      Node
2  $old       Existing attribute name
3  $new       New attribute name

changeAttr

Change the name of an attribute unless it has already been set and return the node

   Parameter  Description
1  $node      Node
2  $old       Existing attribute name
3  $new       New attribute name

renameAttrValue

Change the name and value of an attribute regardless of whether the new attribute already exists and return the node

   Parameter  Description
1  $node      Node
2  $old       Existing attribute name
3  $oldValue  Existing attribute value
4  $new       New attribute name
5  $newValue  New attribute value

changeAttrValue

Change the name and value of an attribute unless it has already been set and return the node

   Parameter  Description
1  $node      Node
2  $old       Existing attribute name
3  $oldValue  Existing attribute value
4  $new       New attribute name
5  $newValue  New attribute value

Traversal

Traverse the parse tree

by

Post-order traversal of a parse tree or sub tree and return the specified starting node

   Parameter  Description
1  $node      Starting node
2  $sub       Sub to call for each sub node
3  @context   Accumulated context

byReverse

Reverse post-order traversal of a parse tree or sub tree and return the specified starting node

   Parameter  Description
1  $node      Starting node
2  $sub       Sub to call for each sub node
3  @context   Accumulated context

down

Pre-order traversal down through a parse tree or sub tree and return the specified starting node

   Parameter  Description
1  $node      Starting node
2  $sub       Sub to call for each sub node
3  @context   Accumulated context

downReverse

Reverse pre-order traversal down through a parse tree or sub tree and return the specified starting node

   Parameter  Description
1  $node      Starting node
2  $sub       Sub to call for each sub node
3  @context   Accumulated context

through

Traverse parse tree visiting each node twice and return the specified starting node

   Parameter  Description
1  $node      Starting node
2  $before    Sub to call when we meet a node
3  $after     Sub to call when we leave a node
4  @context   Accumulated context
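
The difference between the traversal orders above can be seen by recording the tags visited. This is a sketch assuming, as in the synopsis, that the sub receives the current node as its first parameter; the sample xml is made up.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b><c/></b><d/></a>');

my (@post, @pre, @both);
$x->by     (sub {push @post, $_[0]->tag});       # post-order: children before parent
$x->down   (sub {push @pre,  $_[0]->tag});       # pre-order: parent before children
$x->through(sub {push @both, '+'.$_[0]->tag},    # called when we meet a node
            sub {push @both, '-'.$_[0]->tag});   # called when we leave a node

print "by:      @post\n";   # c b d a
print "down:    @pre\n";    # a b c d
print "through: @both\n";   # +a +b +c -c -b +d -d -a
```

Post-order is convenient for edits that restructure a node's parent, as in the synopsis, because the children have already been processed by the time the parent is visited.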

Contents

Contents of the specified node

contents

Return all the nodes contained by this node either as an array or as a reference to such an array

   Parameter  Description
1  $node      Node

contentBeyond

Return all the nodes following this node at the level of this node

   Parameter  Description
1  $node      Node

contentBefore

Return all the nodes preceding this node at the level of this node

   Parameter  Description
1  $node      Node

contentAsTags

Return a string containing the tags of all the nodes contained by this node separated by single spaces

   Parameter  Description
1  $node      Node

contentBeyondAsTags

Return a string containing the tags of all the nodes following this node separated by single spaces

   Parameter  Description
1  $node      Node

position

Return the index of a node in its parent's content

   Parameter  Description
1  $node      Node

index

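Return the index of a node in its parent's index for the node's tag, that is: its position amongst the nodes with the same tag under the same parent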

   Parameter  Description
1  $node      Node
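
The following sketch contrasts position and index on a made-up tree: position counts all the nodes under the parent, while index counts only those with the same tag, which is how the synopsis numbers its info nodes.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b/><c/><b/></a>');
my $n = $x->last;              # the second <b>

print $n->position, "\n";      # 2: third node in the content of <a>
print $n->index,    "\n";      # 1: second <b> under <a>
```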

present

Return the count of the number of the specified tag types present immediately under a node

   Parameter  Description
1  $node      Node
2  @names     Possible tags immediately under the node

count

Return the count of the number of instances of the specified tags under the specified node, either by tag in array context or in total in scalar context

   Parameter  Description
1  $node      Node
2  @names     Possible tags immediately under the node

isText

Confirm that this is a text node

   Parameter  Description
1  $node      Node to test

isBlankText

Confirm that this is a text node and that it is blank

   Parameter  Description
1  $node      Node to test

Navigation

Move around in the parse tree

get

Return a sub node under the specified node by its position in each index with position zero assumed if no position is supplied

   Parameter  Description
1  $node      Node
2  @position  Position specification: (index position?)* where position defaults to zero if not specified
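
To make the (index position?)* grammar concrete, the pure Perl sketch below expands a specification into tag and position pairs. The helper expandSpec is hypothetical and only illustrates how such a specification reads; it is not how the module implements get.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a position specification into [tag, position] pairs
sub expandSpec
 {my @spec = @_;
  my @pairs;
  while (@spec)
   {my $tag = shift @spec;
    # An optional number follows each tag; position defaults to zero
    my $pos = (@spec && $spec[0] =~ /\A\d+\z/) ? shift @spec : 0;
    push @pairs, [$tag, $pos];
   }
  @pairs
 }

my @pairs = expandSpec(qw(aa 1 bb));
print join(' ', map {sprintf '%s[%d]', @$_} @pairs), "\n";   # aa[1] bb[0]
```

Read this way, qw(aa 1 bb) names the first bb under the second aa below the starting node.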

c

Return an array of all the nodes with the specified tag below the specified node

   Parameter  Description
1  $node      Node
2  $tag       Tag

first

Return the first node below this node

   Parameter  Description
1  $node      Node

firstChild

Return the first instance of each of the specified tags under the specified node

   Parameter  Description
1  $node      Node
2  @tags      Tags to find the first instance of

firstContextOf

Return the first node encountered in the specified context in a depth first post-order traversal of the parse tree

   Parameter  Description
1  $node      Node
2  @context   Array of tags specifying context

last

Return the last node below this node

   Parameter  Description
1  $node      Node

lastContextOf

Return the last node encountered in the specified context in a depth first reverse pre-order traversal of the parse tree

   Parameter  Description
1  $node      Node
2  @context   Array of tags specifying context

next

Return the node next to the specified node

   Parameter  Description
1  $node      Node

nextNonBlank

Return the next node skipping any intervening blank text node

   Parameter  Description
1  $node      Node

prev

Return the node previous to the specified node

   Parameter  Description
1  $node      Node

prevNotBlank

Return the previous node skipping any intervening blank text node

   Parameter  Description
1  $node      Node

upto

Return the first ancestral node that matches the specified context

   Parameter  Description
1  $node      Start node
2  @tags      Tags identifying context

Position

at

Confirm that the node has the specified ancestry

   Parameter  Description
1  $node      Starting node
2  @context   Ancestry

context

Return a string containing the tag of this node and its ancestors separated by single spaces

   Parameter  Description
1  $node      Node

isFirst

Confirm that this node is the first node under its parent

   Parameter  Description
1  $node      Node

isLast

Confirm that this node is the last node under its parent

   Parameter  Description
1  $node      Node

isOnlyChild

Confirm that this node is the only node under its parent

   Parameter  Description
1  $node      Node

isEmpty

Confirm that this node is empty, that is: this node has no content, not even a blank string of text

   Parameter  Description
1  $node      Node

over

Confirm that the string representing the tags at the level below this node matches a regular expression

   Parameter  Description
1  $node      Node
2  $re        Regular expression

after

Confirm that the string representing the tags following this node matches a regular expression

   Parameter  Description
1  $node      Node
2  $re        Regular expression

before

Confirm that the string representing the tags preceding this node matches a regular expression

   Parameter  Description
1  $node      Node
2  $re        Regular expression
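
The position tests above can be tried on a small tree modelled on the synopsis. This sketch uses only methods documented in this file (first, context, at, isFirst, contentAsTags, over); the sample xml is made up.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x  = Data::Edit::Xml::new('<sli><li><p>a</p><p>b</p></li></sli>');
my $li = $x->first;
my $p  = $li->first;

print $p->context, "\n";                         # p li sli
print "at\n"      if $p->at(qw(p li sli));       # ancestry is given innermost tag first
print "isFirst\n" if $p->isFirst;
print $li->contentAsTags, "\n";                  # p p
print "over\n"    if $li->over(qr(\Ap( p)+\Z));  # two or more p and nothing else
```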

Editing

Edit the data in the parse tree

change

Change the name of a node in an optional tag context and return the node

   Parameter  Description
1  $node      Node
2  $name      New name
3  @tags      Tags defining the context

Structure

Change the structure of the parse tree

wrapWith

Wrap the original node in a new node forcing the original node down deepening the parse tree; return the new wrapping node

   Parameter  Description
1  $old       Node
2  $tag       Tag for new node

wrapUp

Wrap the original node in a sequence of new nodes forcing the original node down deepening the parse tree; return the array of wrapping nodes

   Parameter  Description
1  $node      Node to wrap
2  @tags      Tags to wrap the node with - with the uppermost tag rightmost

wrapDown

Wrap the content of the original node in a sequence of new nodes forcing the original node up deepening the parse tree; return the array of wrapping nodes

   Parameter  Description
1  $node      Node to wrap
2  @tags      Tags to wrap the node with - with the uppermost tag rightmost

wrapContentWith

Wrap the content of a node in a new node: the original node then contains the new node, which in turn contains the original node's content; returns the new wrapping node

   Parameter  Description
1  $old       Node
2  $tag       Tag for new node

unwrap

Unwrap a node by inserting its content into its parent at the point containing the node; returns the parent node

   Parameter  Description
1  $node      Node to unwrap

replaceWith

Replace a node (and all its content) with a new node (and all its content) and return the new node

   Parameter  Description
1  $old       Old node
2  $new       New node

replaceWithText

Replace a node (and all its content) with a new text node and return the new node

   Parameter  Description
1  $old       Old node
2  $text      Text of new node

replaceWithBlank

Replace a node (and all its content) with a new blank text node and return the new node

   Parameter  Description
1  $old       Old node

Cut and Put

Move nodes around in the parse tree

cut

Cut out a node - remove the node from the parse tree and return the node so that it can be put elsewhere

   Parameter  Description
1  $node      Node to cut out

putFirst

Place the new node at the front of the content of the original node and return the new node

   Parameter  Description
1  $old       Original node
2  $new       New node

putLast

Place the new node at the end of the content of the original node and return the new node

   Parameter  Description
1  $old       Original node
2  $new       New node

putNext

Place the new node just after the original node in the content of the parent and return the new node

   Parameter  Description
1  $old       Original node
2  $new       New node

putPrev

Place the new node just before the original node in the content of the parent and return the new node

   Parameter  Description
1  $old       Original node
2  $new       New node
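
Cut and put are typically used together to move a node. The sketch below moves the first child of a made-up tree to the end of its parent's content, using cut, putLast and string as documented in this file.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x     = Data::Edit::Xml::new('<a><b/><c/></a>');
my $moved = $x->first->cut;     # cut returns the node that was removed
$x->putLast($moved);            # place it at the end of the content of <a>

my $result = $x->string;
print "$result\n";              # <a><c/><b/></a>
```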

Put as text

Add text to the parse tree

putFirstAsText

Add a new text node first under a parent and return the new text node

   Parameter  Description
1  $node      The parent node
2  $text      The string to be added which might contain unparsed xml as well as text

putLastAsText

Add a new text node last under a parent and return the new text node

   Parameter  Description
1  $node      The parent node
2  $text      The string to be added which might contain unparsed xml as well as text

putNextAsText

Add a new text node following this node and return the new text node

   Parameter  Description
1  $node      The node after which the text is to be added
2  $text      The string to be added which might contain unparsed xml as well as text

putPrevAsText

Add a new text node preceding this node and return the new text node

   Parameter  Description
1  $node      The node before which the text is to be added
2  $text      The string to be added which might contain unparsed xml as well as text

Labels

Additional labels for a node which will be recognized by Data::Edit::Xml::Lint

addLabels

Add the named labels to the specified node and return that node

   Parameter  Description
1  $node      Node in parse tree
2  @labels    Names of labels to add

countLabels

Return the count of the number of labels at a node

   Parameter  Description
1  $node      Node in parse tree

getLabels

Return the names of all the labels set on a node

   Parameter  Description
1  $node      Node in parse tree

deleteLabels

Delete the specified labels in the specified node and return that node

   Parameter  Description
1  $node      Node in parse tree
2  @labels    Names of the labels to be deleted

deleteAllLabels

Delete all the labels in the specified node and return that node

   Parameter  Description
1  $node      Node in parse tree

copyLabels

Copy all the labels from the source node to the target node and return the source node

   Parameter  Description
1  $source    Source node
2  $target    Target node

moveLabels

Move all the labels from the source node to the target node and return the source node

   Parameter  Description
1  $source    Source node
2  $target    Target node

Operators

Operator access to methods

opString

-c : clone, -p : pretty string, -r : renew, -s : string, -t : tag. Example: -p $x to print node $x as a pretty string

   Parameter  Description
1  $node      Node
2  $op        Monadic operator

opContents

@{} : content of a node. Example: grep {...} @$x to search the contents of node $x

   Parameter  Description
1  $node      Node

opOut

>>= : Write a parse tree out on a file. Example: $x >>= *STDERR

   Parameter  Description
1  $node      Node
2  $file      File

opContext

<= : Check that a node is in the context specified by the referenced array of words. Example: $c <= [qw(c b a)] to confirm that node $c has tag 'c', parent 'b' and grand parent 'a'

   Parameter  Description
1  $node      Node
2  $context   Reference to array of words specifying the parents of the desired node

opPutFirst

+ or += : put a node or string first under a node. Example: my $f = $a + '<p>first</p>'

   Parameter  Description
1  $node      Node
2  $text      Node or text to place first under the node

opPutLast

- : put a node or string last under a node. Example: my $l = $a - '<p>last</p>'

   Parameter  Description
1  $node      Node
2  $text      Node or text to place last under the node

opPutNext

> : put a node or string after the current node. Example: my $n = $a > '<p>next</p>'

   Parameter  Description
1  $node      Node
2  $text      Node or text to place after the first node

opPutPrev

< : put a node or string before the current node. Example: my $p = $a < '<p>prev</p>'

   Parameter  Description
1  $node      Node
2  $text      Node or text to place before the first node

opBy

x= : Traverse a parse tree in post-order, as with "by". Example: $a x= sub {say -s $_} to print every sub tree in a parse tree

   Parameter  Description
1  $node      Parse tree
2  $code      Code to execute against each node

opGet

>> : Search for a node via a specification provided as a reference to an array of words and numbers. Each word represents a tag name, each number the index of the preceding tag, or zero by default. Example: my $f = $a >> [qw(aa 1 bb)] to find the first bb under the second aa under $a

   Parameter  Description
1  $node      Node
2  $get       Reference to an array of search parameters

opAttr

% : Get the value of an attribute of this node. Example: my $a = $x % 'href' to get the href attribute of the node at $x

   Parameter  Description
1  $node      Node
2  $attr      Attribute name

opSetTag

+= : Set the tag for a node. Example: $a += 'tag' to change the tag to 'tag' at the node $a

   Parameter  Description
1  $node      Node
2  $tag       Tag

opSetId

-= : Set the id for a node. Example: $a -= 'id' to change the id to 'id' at node $a

   Parameter  Description
1  $node      Node
2  $id        Id

opWrapWith

/ or /= : Wrap node with a tag, returning or not returning the wrapping node. Example: $x /= 'aa' to wrap node $x with a node with a tag of 'aa'

   Parameter  Description
1  $node      Node
2  $tag       Tag

opWrapContentWith

* or *= : Wrap content with a tag, returning or not returning the wrapping node. Example: $x *= 'aa' to wrap the content of node $x with a node with a tag of 'aa'

   Parameter  Description
1  $node      Node
2  $tag       Tag

opCut

-- : Cut out a node. Example: --$x to cut out the node $x

   Parameter  Description
1  $node      Node

opUnWrap

++ : Unwrap a node. Example: ++$x to unwrap the node $x

   Parameter  Description
1  $node      Node

Debug

Debugging methods

printAttributes

Print the attributes of a node

   Parameter  Description
1  $node      Node whose attributes are to be printed

checkParentage

Check the parent pointers are correct in a parse tree

   Parameter  Description
1  $x         Parse tree

checkParser

Check that every node has a parser

   Parameter  Description
1  $x         Parse tree

Index

addConditions

addLabels

after

at

attr :lvalue

attrCount

attributes

attrs

before

by

byReverse

c

cdata

change

changeAttr

changeAttrValue

checkParentage

checkParser

clone

conditions

content

contentAsTags

contentBefore

contentBeyond

contentBeyondAsTags

contents

contentString

context

copyLabels

count

countLabels

cut

deleteAllLabels

deleteAttr

deleteAttrs

deleteConditions

deleteLabels

down

downReverse

equals

errorsFile

first

firstChild

firstContextOf

get

getLabels

index

indexes

input

inputFile

inputString

isBlankText

isEmpty

isFirst

isLast

isOnlyChild

isText

labels

last

lastContextOf

listConditions

moveLabels

new

newTag

newText

newTree

next

nextNonBlank

opAttr

opBy

opContents

opContext

opCut

opGet

opOut

opPutFirst

opPutLast

opPutNext

opPutPrev

opSetId

opSetTag

opString

opUnWrap

opWrapContentWith

opWrapWith

over

parent

parse

parser

position

present

PrettyContentString

prettyString

prev

prevNotBlank

printAttributes

putFirst

putFirstAsText

putLast

putLastAsText

putNext

putNextAsText

putPrev

putPrevAsText

renameAttr

renameAttrValue

renew

replaceSpecialChars

replaceWith

replaceWithBlank

replaceWithText

restore

save

setAttr

string

stringWithConditions

tag

tags

text

through

unwrap

upto

wrapContentWith

wrapDown

wrapUp

wrapWith

Installation

This module is written in 100% Pure Perl and is thus easy to read, use, modify and install.

Standard Module::Build process for building and installing modules:

perl Build.PL
./Build
./Build test
./Build install

Author

philiprbrenan@gmail.com

http://www.appaapps.com

Copyright

Copyright (c) 2016-2017 Philip R Brenan.

This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.