NAME
Text::TEI::Collate::Manuscript - represent a manuscript text for collation
DESCRIPTION
Text::TEI::Collate::Manuscript is an object that describes a manuscript.
METHODS
new
Creates a new manuscript object. At present the object is just a container.
tokenize_as_json
Returns a JSON serialization of the Manuscript object, of the form:
    { id: $self->sigil, name: $self->identifier, tokens: [ WORDLIST ] }
where each Word object in the word list is serialized as
    { t: $w->word, c: $w->canonical_form, n: $w->comparison_form,
      punctuation: [ $w->punctuation ], placeholders: [ $w->placeholders ] }
This method optionally takes a list of array indices to skip when serializing the word list, which is useful for excluding certain special tokens.
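The shape of that serialization, including the effect of the skip list, can be sketched without the module itself. The following is only an illustration under stated assumptions: plain hashes stand in for Word objects, the helper tokens_as_structure and all sample values are hypothetical, and JSON::PP is used for encoding, which may not be the encoder the module relies on.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical stand-ins for Word objects: plain hashes keyed by the
# accessor names used in the description above. Sample values are invented.
my @words = (
    { word => 'Lorem', canonical_form => 'Lorem', comparison_form => 'lorem',
      punctuation => [], placeholders => [] },
    { word => '', canonical_form => '', comparison_form => '',
      punctuation => [], placeholders => ['special'] },
    { word => 'ipsum', canonical_form => 'ipsum', comparison_form => 'ipsum',
      punctuation => [','], placeholders => [] },
);

# Hypothetical helper (not part of the module): builds the documented
# structure, leaving out any word whose array index appears in @skip.
sub tokens_as_structure {
    my ( $sigil, $identifier, $words, @skip ) = @_;
    my %skip = map { $_ => 1 } @skip;
    my @tokens;
    for my $i ( 0 .. $#{$words} ) {
        next if $skip{$i};
        my $w = $words->[$i];
        push @tokens, {
            t            => $w->{word},
            c            => $w->{canonical_form},
            n            => $w->{comparison_form},
            punctuation  => [ @{ $w->{punctuation} } ],
            placeholders => [ @{ $w->{placeholders} } ],
        };
    }
    return { id => $sigil, name => $identifier, tokens => \@tokens };
}

# Skip index 1, the special token in this sample.
my $structure = tokens_as_structure( 'A', 'Example manuscript A', \@words, 1 );
my $json = JSON::PP->new->canonical->pretty->encode($structure);
print $json;
```

With index 1 skipped, the tokens array holds only the two ordinary words. The canonical option merely sorts keys so the output is stable between runs.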
BUGS / TODO
Many things. Tests, for instance. I shall enumerate them later.
AUTHOR
Tara L Andrews <aurum@cpan.org>