NAME

DTA::TokWrap::Processor::standoff::xsl - DTA tokenizer wrappers: t.xml -> (s.xml, w.xml, a.xml) via XSL

SYNOPSIS

use DTA::TokWrap::Processor::standoff::xsl;

$so = DTA::TokWrap::Processor::standoff::xsl->new(%opts);
$doc_or_undef = $so->sosxml($doc);
$doc_or_undef = $so->sowxml($doc);
$doc_or_undef = $so->soaxml($doc);
$doc_or_undef = $so->standoff($doc);

##-- debugging
undef = $so->dump_t2s_stylesheet($filename_or_fh);
undef = $so->dump_t2w_stylesheet($filename_or_fh);
undef = $so->dump_t2a_stylesheet($filename_or_fh);

DESCRIPTION

This module is deprecated; prefer DTA::TokWrap::Processor::standoff.

DTA::TokWrap::Processor::standoff::xsl provides an object-oriented DTA::TokWrap::Processor wrapper for generation of various standoff XML formats for DTA::TokWrap::Document objects via (slow) XSL stylesheet transformations.

Most users should use the high-level DTA::TokWrap wrapper class instead of using this module directly.

Constants

@ISA

DTA::TokWrap::Processor::standoff::xsl inherits from DTA::TokWrap::Processor.

Constructors etc.

new
$so = $CLASS_OR_OBJECT->new(%args);

Constructor.

%args, %$so:

##-- Stylesheet: tx2sx (t.xml -> s.xml)
t2s_stylestr   => $stylestr,          ##-- xsl stylesheet string
t2s_stylesheet => $stylesheet,        ##-- compiled xsl stylesheet
##
##-- Stylesheet: tx2wx (t.xml -> w.xml)
t2w_stylestr   => $stylestr,          ##-- xsl stylesheet string
t2w_stylesheet => $stylesheet,        ##-- compiled xsl stylesheet
##
##-- Stylesheet: tx2ax (t.xml -> a.xml)
t2a_stylestr   => $stylestr,          ##-- xsl stylesheet string
t2a_stylesheet => $stylesheet,        ##-- compiled xsl stylesheet

defaults
%defaults = CLASS->defaults();

Static class-dependent defaults.

init
$so = $so->init();

Dynamic object-dependent defaults.

Methods: XSL stylesheets

Low-level utility methods.

The stylesheets returned may or may not accurately reflect the documents generated by the sosxml(), sowxml(), and soaxml() methods.

ensure_stylesheets
$so_or_undef = $so->ensure_stylesheets();

Ensures that required XSL stylesheets have been compiled.
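
As a rough illustration of the difference between a stylesheet string and a compiled stylesheet, the following sketch compiles and applies a toy stylesheet with XML::LibXML and XML::LibXSLT. It assumes an XML::LibXSLT-style backend. The stylesheet, the input document, and the helper compile_stylesheet() are hypothetical and do not reflect the module's real stylesheets or the real t.xml format.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

## Hypothetical toy stylesheet (NOT one of the module's real stylesheets):
## emits one <w ref="..."/> per input <w id="..."/>
my $stylestr = <<'XSL';
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8"/>
  <xsl:template match="/">
    <tokens>
      <xsl:for-each select="//w">
        <w ref="{@id}"/>
      </xsl:for-each>
    </tokens>
  </xsl:template>
</xsl:stylesheet>
XSL

## Hypothetical helper: parse a stylesheet string and compile it
sub compile_stylesheet {
  my ($str) = @_;
  my $style_doc = XML::LibXML->load_xml(string => $str);
  return XML::LibXSLT->new->parse_stylesheet($style_doc);
}

## Compile once, then apply to a (made-up) input document
my $stylesheet = compile_stylesheet($stylestr);
my $input  = XML::LibXML->load_xml(string => '<doc><s id="s1"><w id="w1"/><w id="w2"/></s></doc>');
my $result = $stylesheet->transform($input);
my $out    = $stylesheet->output_as_chars($result);
print $out;
```

Compiling once and reusing the compiled object avoids re-parsing the stylesheet for every document, which is presumably why the object stores both forms.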

t2s_stylestr
$xsl_str = $so->t2s_stylestr();

Returns XSL stylesheet string for generation of sentence-level standoff XML (.s.xml) from "master" tokenized XML (.t.xml).

t2w_stylestr
$xsl_str = $so->t2w_stylestr();

Returns XSL stylesheet string for generation of token-level standoff XML (.w.xml) from "master" tokenized XML (.t.xml).

t2a_stylestr
$xsl_str = $so->t2a_stylestr();

Returns XSL stylesheet string for generation of token-analysis-level standoff XML (.a.xml) from "master" tokenized XML (.t.xml).

dump_t2s_stylesheet
$so->dump_t2s_stylesheet($filename_or_fh);

Dumps the generated sentence-level standoff stylesheet to $filename_or_fh.

dump_t2w_stylesheet
$so->dump_t2w_stylesheet($filename_or_fh);

Dumps the generated token-level standoff stylesheet to $filename_or_fh.

dump_t2a_stylesheet
$so->dump_t2a_stylesheet($filename_or_fh);

Dumps the generated token-analysis-level standoff stylesheet to $filename_or_fh.
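
For debugging, the three dump methods can be combined to write all stylesheets into a scratch directory. This is a minimal sketch: the helper dump_all_stylesheets() and the output file names are hypothetical, and it assumes that new() can be called without arguments. If the module is not installed, the helper returns an empty list.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

## Hypothetical helper: dump all three stylesheets into $dir.
## Returns the list of files written, or an empty list if the module is unavailable.
sub dump_all_stylesheets {
  my ($dir) = @_;
  eval { require DTA::TokWrap::Processor::standoff::xsl; 1 } or return;
  my $so = DTA::TokWrap::Processor::standoff::xsl->new();  ## assumption: no arguments required
  my @files = map { "$dir/$_.xsl" } qw(t2s t2w t2a);       ## file names chosen for illustration
  $so->dump_t2s_stylesheet($files[0]);
  $so->dump_t2w_stylesheet($files[1]);
  $so->dump_t2a_stylesheet($files[2]);
  return @files;
}

my @dumped = dump_all_stylesheets(tempdir(CLEANUP => 1));
print @dumped ? "wrote: @dumped\n" : "module not available; nothing dumped\n";
```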

Methods: top-level

standoff
$doc_or_undef = $CLASS_OR_OBJECT->standoff($doc);

Convenience wrapper for sosxml(), sowxml(), and soaxml().
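
Since each generation method returns either the document or undef, a caller can chain the three steps and stop at the first failure. The sketch below shows one way to do that. The helper run_standoff_steps() and the stub class My::StubStandoff are hypothetical, and the call order and error handling shown here are assumptions, not a description of what standoff() does internally.

```perl
use strict;
use warnings;

## Hypothetical helper: run the three steps in sequence,
## dying on the first one that returns a false value
sub run_standoff_steps {
  my ($so, $doc) = @_;
  foreach my $step (qw(sosxml sowxml soaxml)) {
    $so->$step($doc) or die "standoff step '$step' failed\n";
  }
  return $doc;
}

## Hypothetical stub standing in for a real processor object (illustration only)
{
  package My::StubStandoff;
  sub new    { return bless {}, shift }
  sub sosxml { $_[1]{sosdoc} = 'stub'; return $_[1] }
  sub sowxml { $_[1]{sowdoc} = 'stub'; return $_[1] }
  sub soaxml { return undef }  ## simulate a failed step
}

my $doc = {};
my $ok  = eval { run_standoff_steps(My::StubStandoff->new, $doc); 1 };
print $ok ? "all steps succeeded\n" : "error: $@";
```

With a real processor, $so would be a DTA::TokWrap::Processor::standoff::xsl object and $doc a DTA::TokWrap::Document.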

sosxml
$doc_or_undef = $CLASS_OR_OBJECT->sosxml($doc);

Generate sentence-level standoff for the DTA::TokWrap::Document object $doc.

Relevant %$doc keys:

xtokdoc  => $xtokdoc,  ##-- (input) XML-ified tokenizer output data, as XML::LibXML::Document
xtokdata => $xtokdata, ##-- (input) fallback: string source for $xtokdoc
sosdoc   => $sosdoc,   ##-- (output) standoff sentence data, refers to $doc->{sowfile}
##
sosxml_stamp0 => $f,   ##-- (output) timestamp of operation begin
sosxml_stamp  => $f,   ##-- (output) timestamp of operation end
sosdoc_stamp  => $f,   ##-- (output) timestamp of operation end

sowxml
$doc_or_undef = $CLASS_OR_OBJECT->sowxml($doc);

Generate token-level standoff for the DTA::TokWrap::Document object $doc.

Relevant %$doc keys:

xtokdoc  => $xtokdoc,  ##-- (input) XML-ified tokenizer output data, as XML::LibXML::Document
xtokdata => $xtokdata, ##-- (input) fallback: string source for $xtokdoc
sowdoc   => $sowdoc,   ##-- (output) standoff token data, refers to $doc->{xmlfile}
##
sowxml_stamp0 => $f,   ##-- (output) timestamp of operation begin
sowxml_stamp  => $f,   ##-- (output) timestamp of operation end
sowdoc_stamp  => $f,   ##-- (output) timestamp of operation end

soaxml
$doc_or_undef = $CLASS_OR_OBJECT->soaxml($doc);

Generate token-analysis-level standoff for the DTA::TokWrap::Document object $doc.

Relevant %$doc keys:

xtokdoc  => $xtokdoc,  ##-- (input) XML-ified tokenizer output data, as XML::LibXML::Document
xtokdata => $xtokdata, ##-- (input) fallback: string source for $xtokdoc
soadoc   => $soadoc,   ##-- (output) standoff token-analysis data, refers to $doc->{sowdoc}
##
soaxml_stamp0 => $f,   ##-- (output) timestamp of operation begin
soaxml_stamp  => $f,   ##-- (output) timestamp of operation end
soadoc_stamp  => $f,   ##-- (output) timestamp of operation end
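
The begin and end stamps documented for the methods above can be used to estimate how long an operation took. The sketch below assumes that the stamps are numeric epoch seconds. The helper elapsed() is hypothetical, and the document hash and its timestamp values are made up for illustration.

```perl
use strict;
use warnings;

## Hypothetical helper: seconds between "${op}_stamp0" and "${op}_stamp",
## or undef if either stamp is missing
sub elapsed {
  my ($doc, $op) = @_;
  my ($t0, $t1) = @{$doc}{"${op}_stamp0", "${op}_stamp"};
  return undef if !defined($t0) || !defined($t1);
  return $t1 - $t0;
}

## Mock document hash with made-up timestamps (illustration only)
my $doc = { sosxml_stamp0 => 1000.25, sosxml_stamp => 1000.75 };
printf "sosxml took %.2f seconds\n", elapsed($doc, 'sosxml');
```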

SEE ALSO

DTA::TokWrap::Intro(3pm), dta-tokwrap.perl(1), ...

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2009-2018 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.