NAME
TermTagger-brat.pl -- A Perl script for tagging text with terms (Brat format output)
SYNOPSIS
TermTagger-brat.pl [options] corpus termlist selected_term_list lemmatised_corpus
OPTIONS
DESCRIPTION
This script tags a corpus with terms and produces output compatible with Brat (<http://brat.nlplab.org/>). The corpus (corpus) is a file with one sentence per line. The term list (termlist) is a file containing one term per line. For each term, additional information (such as the canonical form) can be given after a colon. Each line of the output file (selected_term_list) contains the sentence number, the term, and the additional information, all separated by a tab character.
EXAMPLES
Tag the textual corpus in corpus-test.txt with terms in the file termlist-test.lst and record the results in the file corpus-test.ann according to the Brat input format:
TermTagger-brat.pl corpus-test.txt termlist-test.lst corpus-test.ann
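To try the command above without existing data, the two input files can be created by hand. The following is a minimal sketch: the sentences and terms are invented for illustration, and the colon used to separate a term from its canonical form is an assumption about the term list syntax. The tagger is run only if it is found on the PATH.

```shell
#!/bin/sh
# Hypothetical sample data; sentences and terms are invented for illustration.

# Corpus: one sentence per line.
cat > corpus-test.txt <<'EOF'
The heat shock protein is induced by stress.
Gene expression changes during sporulation.
EOF

# Term list: one term per line. Optional additional information
# (here a canonical form) follows a colon; this separator is an assumption.
cat > termlist-test.lst <<'EOF'
heat shock proteins : heat shock protein
gene expression
EOF

# Run the tagger only if it is installed, then show the result file.
if command -v TermTagger-brat.pl >/dev/null 2>&1; then
    TermTagger-brat.pl corpus-test.txt termlist-test.lst corpus-test.ann
    cat corpus-test.ann
fi
```

The file names match those used in the example, so the result is written to corpus-test.ann in the current directory.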
SEE ALSO
Alvis web site: http://www.alvis.info
Brat: http://brat.nlplab.org/
AUTHORS
Thierry Hamon <thierry.hamon@limsi.fr>
LICENSE
Copyright (C) 2006 by Thierry Hamon
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.