NAME

DTA::CAB::Format::Perl - Datum parser|formatter: perl code via Data::Dumper, eval()

SYNOPSIS

use DTA::CAB::Format::Perl;

$fmt = DTA::CAB::Format::Perl->new(%args);

##========================================================================
## Methods: Input

$fmt = $fmt->close();
$fmt = $fmt->parsePerlString($str);
$doc = $fmt->parseDocument();

##========================================================================
## Methods: Output

$fmt = $fmt->flush();
$str = $fmt->toString();
$fmt = $fmt->putToken($tok);
$fmt = $fmt->putSentence($sent);
$fmt = $fmt->putDocument($doc);

DESCRIPTION

DTA::CAB::Format::Perl is a DTA::CAB::Format datum parser/formatter which reads and writes data as Perl code, using eval() for input and Data::Dumper for output.
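The read/write mechanism described above can be sketched with core modules only. This is a minimal illustration, not the module's actual implementation: the stand-in structure, the variable name `document`, and the chosen Data::Dumper options are assumptions made for the example.

```perl
use strict;
use warnings;
use Data::Dumper;

## a small stand-in for a document; real documents carry far more analysis data
my $doc = { body => [ { tokens => [ { text => 'wie' }, { text => 'oede' } ] } ] };

## writing: serialize the structure as perl source code
my $dumper = Data::Dumper->new([$doc], ['document']);
$dumper->Indent(1)->Sortkeys(1);
my $code = $dumper->Dump;

## reading: evaluate the code and return the reconstructed structure
my $document;                         ## assigned by the evaluated code
my $copy = eval "$code; \$document";
die "eval failed: $@" if $@;

print $copy->{body}[0]{tokens}[1]{text}, "\n";
```

Because the serialized form is ordinary Perl source, the round trip needs no separate parser; the cost is that reading executes whatever code the input contains.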

Globals

Variable: @ISA

DTA::CAB::Format::Perl inherits from DTA::CAB::Format.

Filenames

DTA::CAB::Format::Perl registers the filename regex:

/\.(?i:prl|pl|perl|dump)$/

with DTA::CAB::Format.
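To see which names the regex above selects, it can be applied to a few sample filenames. The filenames here are invented for illustration.

```perl
use strict;
use warnings;

## the filename regex quoted above
my $re = qr/\.(?i:prl|pl|perl|dump)$/;

## hypothetical filenames, for illustration only
my @files   = qw(doc.pl doc.PERL doc.dump doc.xml doc.pl.gz);
my @matched = grep { $_ =~ $re } @files;

print "$_\n" for @matched;
```

The match is case-insensitive and anchored at the end of the name, so `doc.PERL` matches while `doc.pl.gz` does not.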

Constructors etc.

new
$fmt = CLASS_OR_OBJ->new(%args);

Constructor.

%args, %$fmt:

##---- Input
doc    => $doc,                 ##-- buffered input document
##
##---- Output
dumper => $dumper,              ##-- underlying Data::Dumper object
##
##---- INHERITED from DTA::CAB::Format
#encoding => $encoding,         ##-- n/a
level     => $formatLevel,      ##-- sets Data::Dumper->Indent() option
outbuf    => $stringBuffer,     ##-- buffered output
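The effect of the Data::Dumper Indent() option that `level` controls can be seen in isolation. The sketch below passes the level straight through to Indent(); whether the module maps format levels to Indent() values in exactly this way is an assumption.

```perl
use strict;
use warnings;
use Data::Dumper;

## a stand-in token
my $tok = { text => 'wie', moot => { tag => 'PWAV' } };

## dump the same structure at two indentation levels
my %out;
for my $level (0, 1) {
  my $dumper = Data::Dumper->new([$tok], ['tok']);
  $dumper->Indent($level)->Sortkeys(1);
  $out{$level} = $dumper->Dump;
}

print "Indent(0):\n$out{0}\n\n";
print "Indent(1):\n$out{1}\n";
```

Indent(0) produces compact single-line output suited to machine consumption, while higher values add line breaks and indentation for human readers.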

Methods: Persistence

noSaveKeys
@keys = $class_or_obj->noSaveKeys();

Override: returns the list of keys not to be saved. This implementation returns qw(doc outbuf).
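One way such a key list might be used is to strip transient buffers from an object's hash before it is saved. The sketch below is an assumption about that usage, not the persistence code of DTA::CAB itself; only the key names `doc` and `outbuf` come from the text above.

```perl
use strict;
use warnings;

## hypothetical formatter state
my %fmt = (doc => { body => [] }, outbuf => 'buffered text', level => 1);

## keys reported as not to be saved
my %skip = map { ($_ => 1) } qw(doc outbuf);

## keep only the remaining (persistent) configuration
my %saved = map { ($_ => $fmt{$_}) } grep { !$skip{$_} } keys %fmt;

print join(',', sort keys %saved), "\n";
```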

Methods: Input

close
$fmt = $fmt->close();

Override: close currently selected input source.

fromString
$fmt = $fmt->fromString($string);

Override: select input from the string $string.

parsePerlString
$fmt = $fmt->parsePerlString($str);

Evaluates $str as perl code, which is expected to return a DTA::CAB::Document object (or something which can be massaged into one), and sets $fmt->{doc} to this new document object.
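The evaluate-then-coerce behaviour described above might look roughly like the following. `My::Document` and `parse_perl_string` are hypothetical names introduced for this sketch, and the coercion rule (bless a plain hash reference) is an assumption; the real method may accept other shapes.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

## hypothetical stand-in class, NOT the real DTA::CAB::Document
{
  package My::Document;
  sub new { my ($class, %args) = @_; return bless({ body => [], %args }, $class); }
}

## hypothetical helper: evaluate $str and coerce the result to a My::Document
sub parse_perl_string {
  my ($str) = @_;
  my $res = eval $str;                 ## runs arbitrary code: trusted input only
  die "could not evaluate perl string: $@" if $@;
  return $res if blessed($res) && $res->isa('My::Document');
  return My::Document->new(%$res) if ref($res) eq 'HASH';  ## assumed coercion rule
  die "perl string did not return a document-like value";
}

my $doc = parse_perl_string(q{ my $d = { body => [ { tokens => [] } ] }; $d });
print ref($doc), "\n";
```

Since eval() executes the string as code, input in this format should only be read from sources that are trusted.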

parseDocument
$doc = $fmt->parseDocument();

Returns the current contents of $fmt->{doc}, i.e. the most recently parsed document.

Methods: Output

flush
$fmt = $fmt->flush();

Override: flush accumulated output.

toString
$str = $fmt->toString();
$str = $fmt->toString($formatLevel);

Override: flush buffered output document to byte-string. This implementation just returns $fmt->{outbuf}, which should already be a byte-string, and has no need of encoding.

putToken
$fmt = $fmt->putToken($tok);

Override: writes a token to the output buffer (non-destructive on $tok).

putSentence
$fmt = $fmt->putSentence($sent);

Override: writes a sentence to the output buffer (non-destructive on $sent).

putDocument
$fmt = $fmt->putDocument($doc);

Override: writes a document to the output buffer (non-destructive on $doc).

EXAMPLE

An example file in the format accepted/generated by this module is:

$document = bless( {
  'body' => [
    {
      'tokens' => [
        {
          'xlit' => {
            'isLatin1' => '1',
            'latin1Text' => 'wie',
            'isLatinExt' => '1'
          },
          'text' => 'wie',
          'hasmorph' => '1',
          'moot' => {
            'tag' => 'PWAV',
            'word' => 'wie',
            'lemma' => 'wie'
          },
          'msafe' => '1',
          'exlex' => 'wie',
          'errid' => 'ec',
          'lang' => [
            'de'
          ]
        },
        {
          'xlit' => {
            'isLatinExt' => '1',
            'latin1Text' => 'oede',
            'isLatin1' => '1'
          },
          'msafe' => '0',
          'moot' => {
            'word' => "\x{f6}de",
            'tag' => 'ADJD',
            'lemma' => "\x{f6}de"
          },
          'text' => 'oede'
        },
        {
          'errid' => 'ec',
          'xlit' => {
            'isLatinExt' => '1',
            'latin1Text' => '!',
            'isLatin1' => '1'
          },
          'exlex' => '!',
          'moot' => {
            'word' => '!',
            'tag' => '$.',
            'lemma' => '!'
          },
          'msafe' => '1',
          'text' => '!'
        }
      ],
      'lang' => 'de'
    }
  ]
}, 'DTA::CAB::Document' );
$document
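Once evaluated, a structure like the one above is an ordinary nested Perl data structure and can be traversed directly. The sketch below uses a trimmed copy of the example and reads the hash fields by key; it does not use any accessor methods of DTA::CAB::Document, and blessing into that class name here does not load the real class.

```perl
use strict;
use warnings;

## trimmed copy of the example document above
my $document = bless({
  body => [
    {
      lang   => 'de',
      tokens => [
        { text => 'wie',  moot => { tag => 'PWAV', lemma => 'wie' } },
        { text => 'oede', moot => { tag => 'ADJD', lemma => "\x{f6}de" } },
        { text => '!',    moot => { tag => '$.',   lemma => '!' } },
      ],
    },
  ],
}, 'DTA::CAB::Document');

## collect one "text<TAB>tag" row per token
my @rows;
for my $sent (@{ $document->{body} }) {
  for my $tok (@{ $sent->{tokens} }) {
    push @rows, join("\t", $tok->{text}, $tok->{moot}{tag});
  }
}

print "$_\n" for @rows;
```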

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2009-2019 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.