NAME

Treex::Block::W2A::Tag - universal block for PoS tagging and lemmatization

VERSION

version 2.20151102

SYNOPSIS

# ==== from command line (W2A::Tag in scenario) ====
echo "Hello there" | treex -t \
 W2A::Tag module=Treex::Tool::Tagger::Simple::XY lemmatize=1 \
 Write::CoNLLX

# ==== creating a derived class ====
package Treex::Block::W2A::XY::TagSimple;
use Moose; use Treex::Core::Common;
use Treex::Tool::Tagger::Simple::XY;
extends 'Treex::Block::W2A::Tag';

# If the tool needs a model, set a default
has model => (is => 'ro', default => 'data/models/tagger/simple/xy.model');

# Override the builder, so $self->tagger is an instance of Treex::Tool::Tagger::Simple::XY
sub _build_tagger {
  my ($self) = @_;
  # $self->_args is a hashref of all parameters passed to this block (from scenario).
  # The tool usually needs just model, but this way it is easy to add new parameters
  # (e.g. mem=1g) to the tool without changing this block.
  $self->_args->{model} = $self->model;
  return Treex::Tool::Tagger::Simple::XY->new($self->_args);
}
1; # add POD and that's all :-)

DESCRIPTION

This class serves two purposes:

- It is a base class for all other PoS tagging blocks.

- It can be used directly in a scenario by specifying the name of the tagger tool in the parameter module.

Lemmatization

Some taggers lemmatize as part of tagging. Some cannot lemmatize at all. Others let you choose whether to lemmatize, and may use fewer resources when lemmatization is switched off. Therefore, this block (and its derived classes) has a parameter lemmatize:

- If set to 0, no lemmas are filled in the trees (even if the tagger tool returns them).

- If set to 1, the tagger should either lemmatize all sentences or, if it does not support lemmatization, fail (via log_fatal) during initialization.
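The two settings of lemmatize can be illustrated with a small stand-alone sketch. It is not the code of this block and does not load Treex. My::ToyTagger, its can_lemmatize flag, and the helpers check_lemmatize and tag_tokens are hypothetical names invented for this example. The sketch also assumes that the tool returns a pair of array references (tags, lemmas) from tag_sentence, and it uses plain die where a real block would call log_fatal.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tagger tool (not part of Treex).
package My::ToyTagger;

sub new {
    my ($class, %args) = @_;
    my $can = exists $args{can_lemmatize} ? $args{can_lemmatize} : 1;
    return bless { can_lemmatize => $can }, $class;
}

sub can_lemmatize { return $_[0]->{can_lemmatize} }

# Assumed interface: takes an arrayref of word forms,
# returns (\@tags, \@lemmas). The tagging rule is a toy.
sub tag_sentence {
    my ($self, $forms_rf) = @_;
    my @tags   = map { $_ =~ /^[A-Z]/ ? 'NNP' : 'NN' } @$forms_rf;
    my @lemmas = map { lc($_) } @$forms_rf;
    return (\@tags, \@lemmas);
}

package main;

# Initialization-time check: with lemmatize=1, a tool that cannot
# lemmatize must fail early (die here stands in for log_fatal).
sub check_lemmatize {
    my ($tagger, $lemmatize) = @_;
    die "lemmatize=1 requested, but the tagger cannot lemmatize\n"
        if $lemmatize && !$tagger->can_lemmatize;
    return 1;
}

# Per-sentence step: tags are always stored; lemmas are stored only
# when lemmatize is true, even if the tool returned them.
sub tag_tokens {
    my ($tagger, $lemmatize, @forms) = @_;
    my ($tags_rf, $lemmas_rf) = $tagger->tag_sentence(\@forms);
    return map {
        my %node = (form => $forms[$_], tag => $tags_rf->[$_]);
        $node{lemma} = $lemmas_rf->[$_] if $lemmatize;
        \%node;
    } 0 .. $#forms;
}

# Usage: check once at start-up, then tag a sentence.
my $tagger = My::ToyTagger->new(can_lemmatize => 1);
check_lemmatize($tagger, 1);
for my $node (tag_tokens($tagger, 1, 'Hello', 'there')) {
    print join("\t", @{$node}{qw(form tag lemma)}), "\n";
}
```

Doing the capability check once at start-up, rather than per sentence, means that a scenario with an unsupported combination stops before any document is processed.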

SEE ALSO

Treex::Tool::Tagger::Role

COPYRIGHT AND LICENCE

Copyright 2011-2012 Martin Popel

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.