NAME

Treex::Core::Block - the basic data-processing unit in the Treex framework

VERSION

version 0.08083

SYNOPSIS

package Treex::Block::My::Block;
use Moose;
use Treex::Core::Common;
extends 'Treex::Core::Block';

sub process_bundle {
    my ( $self, $bundle ) = @_;

    # bundle processing

}

1;

DESCRIPTION

Treex::Core::Block is a base class serving as a common ancestor of all Treex blocks. Treex::Core::Block cannot be used directly in a scenario. Use its descendants, which implement one of the methods process_document(), process_bundle(), process_zone(), process_[atnp]tree() or process_[atnp]node().

CONSTRUCTOR

my $block = Treex::Block::My::Block->new();

An instance of a block derived from Treex::Core::Block can be created by the constructor. Optionally, a reference to a hash of block parameters can be passed as the constructor's argument (see "BLOCK PARAMETRIZATION"). However, the constructor is unlikely to appear in your code, since blocks are usually instantiated automatically when a scenario is initialized.

METHODS FOR BLOCK EXECUTION

You must override one of the following methods:

$block->process_document($document);

Applies the block instance to the given instance of Treex::Core::Document. The default implementation iterates over all bundles in the document and calls process_bundle() on each of them, so in most cases you do not need to override this method.
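The default dispatch described above can be pictured with a small self-contained toy model in plain Perl. This is only a sketch of the pattern, not the Treex implementation: the packages Toy::Document, Toy::Block and Toy::Block::Collect and the seen() accessor are hypothetical stand-ins, and bundles are modelled as plain strings.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a document: just an ordered list of bundles.
package Toy::Document;
sub new {
    my ( $class, @bundles ) = @_;
    return bless { bundles => [@bundles] }, $class;
}
sub get_bundles { my ($self) = @_; return @{ $self->{bundles} }; }

# Hypothetical stand-in for the base block class.
package Toy::Block;
sub new { my ($class) = @_; return bless {}, $class; }

# Default behaviour: visit every bundle and delegate to process_bundle().
sub process_document {
    my ( $self, $document ) = @_;
    $self->process_bundle($_) for $document->get_bundles();
    return;
}

# A derived block overrides only process_bundle().
package Toy::Block::Collect;
our @ISA = ('Toy::Block');

sub process_bundle {
    my ( $self, $bundle ) = @_;
    push @{ $self->{seen} }, $bundle;    # remember the bundles in visiting order
    return;
}
sub seen { my ($self) = @_; return @{ $self->{seen} || [] }; }

package main;

my $block = Toy::Block::Collect->new();
$block->process_document( Toy::Document->new( 'b1', 'b2', 'b3' ) );
print join( ',', $block->seen() ), "\n";    # prints "b1,b2,b3"
```

The derived class never touches process_document(); it inherits the loop and supplies only the per-bundle work, which is the situation the paragraph above calls the common case.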

$block->process_bundle($bundle);

Applies the block instance to the given bundle (Treex::Core::Bundle).

$block->process_zone($zone);

Applies the block instance to the given bundle zone (Treex::Core::BundleZone). Unlike process_document() and process_bundle(), process_zone() requires the block attribute language (and possibly also selector) to be specified.

$block->process_end();

This method is called after all documents have been processed. The default implementation is empty, but derived classes can override it, e.g. to print final summaries or statistics. Overriding this method is preferable both to standard Perl END blocks (where you cannot access $self and instance attributes) and to DEMOLISH (which is not called in some cases, e.g. treex --watch).
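The role of process_end() can be illustrated with another plain-Perl toy model. Again this is a sketch under stated assumptions rather than Treex code: Toy::Block::CountBundles is a hypothetical class, each document is modelled as an array reference of bundles, and the loop in package main stands in for whatever drives the block over the documents.

```perl
use strict;
use warnings;

# Hypothetical block that accumulates state across documents.
package Toy::Block::CountBundles;

sub new {
    my ($class) = @_;
    return bless { count => 0, report => undef }, $class;
}

# Here a "document" is simply an array reference of bundles.
sub process_document {
    my ( $self, $bundles ) = @_;
    $self->{count} += scalar @{$bundles};
    return;
}

# Called once, after the last document; $self is still fully available.
sub process_end {
    my ($self) = @_;
    $self->{report} = "bundles processed: $self->{count}";
    return;
}

package main;

my $block     = Toy::Block::CountBundles->new();
my @documents = ( [ 'b1', 'b2' ], ['b3'] );

$block->process_document($_) for @documents;    # per-document work
$block->process_end();                          # final summary
print $block->{report}, "\n";                   # prints "bundles processed: 3"
```

Because the summary is produced in a method, it can read the counter stored on the instance, which an END block could not do.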

BLOCK PARAMETRIZATION

my $block = BlockGroup::My_Block->new( { $name1 => $value1, $name2 => $value2... } );

Block instances can be parametrized by a hash containing parameter name/value pairs.

my $param_value = $block->get_parameter($param_name);

Parameter values used in block construction can be retrieved with the get_parameter() method, but they cannot be changed.

MISCELLANEOUS

my $langcode_selector = $block->zone_label();

my $block_name = $block->get_block_name();

get_block_name() returns the name of the block module.

my @needed_files = $block->get_required_share_files();

If a block requires some files to be present in the shared part of Treex, their list (with paths relative to Treex::Core::Config->share_dir) can be specified by overriding this method. By default, an empty list is returned. The presence of the files is checked automatically in the block constructor. If any required file is missing, the constructor tries to download it from http://ufallab.ms.mff.cuni.cz.

This method should be used especially for downloading statistical models, but not for installed tools or libraries.

sub get_required_share_files {
    my $self = shift;
    return (
        'data/models/mytool/'.$self->language.'/features.gz',
        'data/models/mytool/'.$self->language.'/weights.tsv',
    );
}

require_files_from_share()

This method checks that the files given as parameters exist and tries to download any that are not present.

SEE ALSO

Treex::Core::Node, Treex::Core::Bundle, Treex::Core::Document, Treex::Core::Scenario

AUTHOR

Zdeněk Žabokrtský <zabokrtsky@ufal.mff.cuni.cz>

Martin Popel <popel@ufal.mff.cuni.cz>

COPYRIGHT AND LICENSE

Copyright © 2011 by Institute of Formal and Applied Linguistics, Charles University in Prague

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.