NAME
DTA::CAB::Format::XmlPerl - Datum parser|formatter: XML (perl-like)
SYNOPSIS
use DTA::CAB::Format::XmlPerl;
##========================================================================
## Constructors etc.
$fmt = DTA::CAB::Format::XmlPerl->new(%args);
##========================================================================
## Methods: Input
$obj = $fmt->parseNode($nod);
$doc = $fmt->parseDocument();
##========================================================================
## Methods: Output
$xmlnod = $fmt->tokenNode($tok);
$xmlnod = $fmt->sentenceNode($sent);
$xmlnod = $fmt->documentNode($doc);
$body_array_node = $fmt->xmlBodyNode();
$sentence_array_node = $fmt->xmlSentenceNode();
$fmt = $fmt->putToken($tok);
$fmt = $fmt->putSentence($sent);
$fmt = $fmt->putDocument($doc);
DESCRIPTION
Globals
- Variable: @ISA
-
DTA::CAB::Format::XmlPerl inherits from DTA::CAB::Format::XmlCommon.
- Filenames
-
DTA::CAB::Format::XmlPerl registers the filename regex:
/\.(?i:xml-perl|perl[\-\.]xml)$/
with DTA::CAB::Format.
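To see which filenames the registered pattern would select, the regex can be tried on a few names. This is a minimal sketch: the regex is copied from the text above, while the helper name is_xmlperl_name and the sample filenames are made up for illustration and are not part of the module.

```perl
use strict;
use warnings;

# Regex copied verbatim from the documentation above.
my $re = qr/\.(?i:xml-perl|perl[\-\.]xml)$/;

# Hypothetical helper (not part of DTA::CAB): true if $name matches the regex.
sub is_xmlperl_name {
    my ($name) = @_;
    return $name =~ $re ? 1 : 0;
}

# Hypothetical filenames, for illustration only.
for my $name (qw(doc.xml-perl doc.perl.xml DOC.PERL-XML doc.xml doc.perlxml)) {
    printf "%-14s %s\n", $name, (is_xmlperl_name($name) ? 'matches' : 'no match');
}
```

The first three names match (the suffix is case-insensitive, and either "-" or "." may separate "perl" and "xml"); the last two do not.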
Constructors etc.
- new
-
$fmt = CLASS_OR_OBJ->new(%args);
Constructor.
%args, %$fmt:
##-- input
xdoc => $xdoc, ##-- XML::LibXML::Document
xprs => $xprs, ##-- XML::LibXML parser
##
##-- output
encoding => $inputEncoding, ##-- default: UTF-8; applies to output only!
level => $level, ##-- output formatting level (default=0)
##
##-- common
#(nothing here)
Methods: Persistence
- noSaveKeys
-
@keys = $class_or_obj->noSaveKeys();
Override: returns the list of keys not to be saved. Here, returns qw(xdoc xprs).
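One way such a key list might be used is to drop the transient parser state before an object is stored. The sketch below only illustrates that idea: the helper savable_copy and the placeholder hash are assumptions, and the actual persistence code in DTA::CAB may work differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DTA::CAB): returns a shallow copy of
# the hash %$obj without the keys listed in @no_save.
sub savable_copy {
    my ($obj, @no_save) = @_;
    my %skip = map { ($_ => 1) } @no_save;
    my %copy;
    for my $key (keys %$obj) {
        next if $skip{$key};
        $copy{$key} = $obj->{$key};
    }
    return \%copy;
}

# Stand-in for a format object; the values are placeholders.
my $fmt  = { xdoc => 'XDOC', xprs => 'XPRS', encoding => 'UTF-8', level => 0 };
my $copy = savable_copy($fmt, qw(xdoc xprs));
print join(', ', sort keys %$copy), "\n";   # prints "encoding, level"
```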
Methods: Input
- parseNode
-
$obj = $fmt->parseNode($nod);
Returns the perl object represented by the XML::LibXML::Node $nod.
- parseDocument
-
$doc = $fmt->parseDocument();
Override: parses the buffered XML::LibXML::Document in $fmt->{xdoc}.
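Judging from the file in the EXAMPLE section, <m> elements appear to hold hashes, <l> elements arrays, and <a> elements scalar values, with the key attribute naming the hash entry. Under that assumption, the conversion can be sketched with plain XML::LibXML calls. The function node_to_perl is a hypothetical stand-in and not the module's parseNode(); it ignores the ref attribute and does no blessing.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical stand-in for parseNode(): converts <m>, <l> and <a> elements
# into hashes, arrays and scalars. The "ref" attribute is ignored here.
sub node_to_perl {
    my ($nod) = @_;
    my $name = $nod->nodeName;
    # Only element children matter; whitespace text nodes are skipped.
    my @kids = grep { $_->nodeType == XML_ELEMENT_NODE } $nod->childNodes;
    return $nod->textContent if $name eq 'a';
    return [ map { node_to_perl($_) } @kids ] if $name eq 'l';
    if ($name eq 'm') {
        my %h;
        $h{ $_->getAttribute('key') } = node_to_perl($_) for @kids;
        return \%h;
    }
    die "unexpected element <$name>";
}

# A small fragment in the style of the EXAMPLE section.
my $xml = '<m><a key="text">wie</a><l key="lang"><a>de</a></l></m>';
my $doc = XML::LibXML->load_xml(string => $xml);
my $tok = node_to_perl($doc->documentElement);
print "$tok->{text} $tok->{lang}[0]\n";   # prints "wie de"
```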
Methods: Output
- tokenNode
-
$xmlnod = $fmt->tokenNode($tok);
Returns an XML::LibXML::Node representing the token $tok.
- sentenceNode
-
$xmlnod = $fmt->sentenceNode($sent);
Returns an XML::LibXML::Node representing the sentence $sent.
- documentNode
-
$xmlnod = $fmt->documentNode($doc);
Returns an XML::LibXML::Node representing the document $doc.
- xmlBodyNode
-
$body_array_node = $fmt->xmlBodyNode();
Gets or creates the buffered array node representing the document body.
- xmlSentenceNode
-
$sentence_array_node = $fmt->xmlSentenceNode();
Gets or creates the buffered array node representing the (current) document sentence.
- putToken
-
$fmt = $fmt->putToken($tok);
Override: write token $tok to output buffer.
- putSentence
-
$fmt = $fmt->putSentence($sent);
Override: write sentence $sent to output buffer.
- putDocument
-
$fmt = $fmt->putDocument($doc);
Override: write document $doc to output buffer.
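The output direction can be pictured as a recursive walk over a Perl data structure. The following is a rough sketch that emits the element names seen in the EXAMPLE section (<m> for hashes, <l> for arrays, <a> for everything else). The functions xml_escape and perl_to_xml are hypothetical and build a plain string; the module itself returns XML::LibXML nodes and also records class names in a ref attribute, which this sketch leaves out.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Hypothetical serializer: hash -> <m>, array -> <l>, anything else -> <a>.
# Hash keys are sorted so that the output is deterministic.
sub perl_to_xml {
    my ($val, $key) = @_;
    my $attr = defined($key) ? ' key="' . xml_escape($key) . '"' : '';
    if (ref($val) eq 'HASH') {
        my $body = join '', map { perl_to_xml($val->{$_}, $_) } sort keys %$val;
        return "<m$attr>$body</m>";
    }
    if (ref($val) eq 'ARRAY') {
        my $body = join '', map { perl_to_xml($_) } @$val;
        return "<l$attr>$body</l>";
    }
    return "<a$attr>" . xml_escape($val) . "</a>";
}

my $tok = { text => 'wie', lang => ['de'] };
print perl_to_xml($tok), "\n";
# prints: <m><l key="lang"><a>de</a></l><a key="text">wie</a></m>
```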
EXAMPLE
An example file in the format accepted/generated by this module is:
<?xml version="1.0" encoding="UTF-8"?>
<m ref="DTA::CAB::Document">
  <l key="body">
    <m>
      <a key="lang">de</a>
      <l key="tokens">
        <m>
          <m key="moot">
            <a key="lemma">wie</a>
            <a key="word">wie</a>
            <a key="tag">PWAV</a>
          </m>
          <l key="lang">
            <a>de</a>
          </l>
          <a key="hasmorph">1</a>
          <a key="msafe">1</a>
          <a key="text">wie</a>
          <a key="exlex">wie</a>
          <a key="errid">ec</a>
          <m key="xlit">
            <a key="latin1Text">wie</a>
            <a key="isLatinExt">1</a>
            <a key="isLatin1">1</a>
          </m>
        </m>
        <m>
          <a key="text">oede</a>
          <a key="msafe">0</a>
          <m key="moot">
            <a key="word">öde</a>
            <a key="tag">ADJD</a>
            <a key="lemma">öde</a>
          </m>
          <m key="xlit">
            <a key="latin1Text">oede</a>
            <a key="isLatin1">1</a>
            <a key="isLatinExt">1</a>
          </m>
        </m>
        <m>
          <m key="moot">
            <a key="lemma">!</a>
            <a key="tag">$.</a>
            <a key="word">!</a>
          </m>
          <a key="msafe">1</a>
          <a key="text">!</a>
          <a key="exlex">!</a>
          <a key="errid">ec</a>
          <m key="xlit">
            <a key="latin1Text">!</a>
            <a key="isLatin1">1</a>
            <a key="isLatinExt">1</a>
          </m>
        </m>
      </l>
    </m>
  </l>
</m>
AUTHOR
Bryan Jurish <moocow@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2009-2019 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.