NAME

Text::TEI::Collate - a collation program for variant manuscript texts

SYNOPSIS

use Text::TEI::Collate;
my $aligner = Text::TEI::Collate->new( 'language' => 'Armenian' );

# Read from strings.
my @manuscripts;
foreach my $str ( @strings_to_collate ) {
  push( @manuscripts, $aligner->read_source( $str ) );
}
$aligner->align( @manuscripts );

# Read from files.  Also works for XML::LibXML::Document objects.
@manuscripts = ();
foreach my $xml_file ( @TEI_files_to_collate ) {
  push( @manuscripts, $aligner->read_source( $xml_file ) );
}
$aligner->align( @manuscripts );

# Read from a JSON input.
@manuscripts = $aligner->read_source( $JSON_string );
$aligner->align( @manuscripts );

DESCRIPTION

Text::TEI::Collate is a collation program for multiple (transcribed) manuscript copies of a known text. Its interface is object-oriented, mostly for the convenience of the author and so that settings can be applied globally.

The object is the alignment engine, or "aligner". The methods that a user will care about are "read_source" and "align", as well as the various output methods; the other methods in this file are public in case a user needs a subset of this package's functionality.

An aligner takes two or more texts; the texts can be strings, filenames, or XML::LibXML::Document objects. It returns two or more Manuscript objects -- one for each text input -- in which identical and similar words are lined up with each other, via empty-string padding.
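To make "empty-string padding" concrete, here is a hand-built illustration of the shape an aligned result takes. It is only a sketch: plain arrays of strings stand in for the Manuscript objects, the sample words are invented, and aligned_columns is a hypothetical helper, not part of this module.

```perl
use strict;
use warnings;

# Hand-built illustration, NOT output produced by Text::TEI::Collate.
# Each array stands in for one aligned manuscript; '' marks a padded gap.
my @ms_a = ( 'the', 'quick', 'brown',  'fox' );
my @ms_b = ( 'the', '',      'brown',  'fox' );    # 'quick' is absent here
my @ms_c = ( 'the', 'quick', 'browne', 'fox' );    # similar word, same column

# Hypothetical helper: turn equal-length rows into a list of columns,
# so that each column holds the readings that were lined up together.
sub aligned_columns {
    my @rows  = @_;
    my $width = scalar @{ $rows[0] };
    return map {
        my $i = $_;
        [ map { $_->[$i] } @rows ]
    } 0 .. $width - 1;
}

my @columns = aligned_columns( \@ms_a, \@ms_b, \@ms_c );

# Print one line per column, showing gaps explicitly.
for my $col (@columns) {
    print join( ' | ', map { $_ eq '' ? '(gap)' : $_ } @$col ), "\n";
}
```

After padding, every row has the same length, so any position can be read across all manuscripts at once.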

Please see the documentation for Text::TEI::Collate::Manuscript and Text::TEI::Collate::Word for more information about the manuscript and word objects.

METHODS

new

Creates a new aligner object. Takes a hash of options; the available options are listed below.

debuglevel - Default 0. The higher the number (between 0 and 3), the more debugging output is produced.
title - Display title for the collation output results, should those results need a display title (e.g. TEI or JSON output).
language - Specify the language module we should use from those available in Text::TEI::Collate::Lang. Default is 'Default'.
fuzziness - The maximum allowable word distance for an approximate match, expressed as a percentage (word distance divided by word length). It can also be given as a hashref with keys 'val', 'short', and 'shortval' if you want to increase the tolerance for short words: words whose length is at or below 'short' use 'shortval' instead of 'val'.
binmode - If STDERR should be using something other than UTF-8, you can set it here. You are probably in for a world of hurt anyway though.
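The fuzziness option can be pictured with a small standalone sketch. It does not reproduce the module's internals: edit_distance and is_fuzzy_match are hypothetical helpers, Levenshtein distance is assumed as the word distance, and "word length" is assumed to mean the length of the longer word.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Plain Levenshtein edit distance (assumed here as the "word distance").
sub edit_distance {
    my ( $s, $t ) = @_;
    my @prev = ( 0 .. length($t) );
    for my $i ( 1 .. length($s) ) {
        my @cur = ($i);
        for my $j ( 1 .. length($t) ) {
            my $cost = substr( $s, $i - 1, 1 ) eq substr( $t, $j - 1, 1 ) ? 0 : 1;
            push @cur,
              min( $prev[$j] + 1, $cur[ $j - 1 ] + 1, $prev[ $j - 1 ] + $cost );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Hypothetical helper: returns 1 if the two words are within tolerance.
# $fuzz is either a plain percentage or a hashref with the keys
# 'val', 'short' and 'shortval' described above.
sub is_fuzzy_match {
    my ( $w1, $w2, $fuzz ) = @_;
    my $len = max( length($w1), length($w2) );    # assumption: longer word
    return 1 if $len == 0;
    my $limit =
      ref($fuzz) eq 'HASH'
      ? ( $len <= $fuzz->{short} ? $fuzz->{shortval} : $fuzz->{val} )
      : $fuzz;
    return 100 * edit_distance( $w1, $w2 ) / $len <= $limit ? 1 : 0;
}

my %short_friendly = ( val => 20, short => 3, shortval => 50 );

print is_fuzzy_match( 'kitten', 'sitten', 20 ), "\n";                 # 1 of 6 letters differs
print is_fuzzy_match( 'an',     'at',     20 ), "\n";                 # 1 of 2 letters differs
print is_fuzzy_match( 'an',     'at',     \%short_friendly ), "\n";   # short-word tolerance applies
```

A one-letter difference is a far larger share of a two-letter word than of a six-letter one, which is why a separate tolerance for short words can be useful. For non-ASCII text such as Armenian, the inputs should be decoded character strings so that length counts characters rather than bytes.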
