NAME

Lingua::PT::PLN - Perl extension for simple natural language processing of the Portuguese language

SYNOPSIS

use Lingua::PT::PLN;

printPN(@options);
printPNstring({ %options... }, $textstring);
printPNstring([ @options... ], $textstring);

forPN( sub{my ($pn, $context)=@_;... } );
forPN( {p=>"double"}, sub{my ($pn, $context)=@_;... }, sub{...} );

forPNstring(sub{my ($pn, $context)=@_;... }, $textstring, $regsep);

$st = syllabe($phrase);
$s = accent($phrase);
$s = wordaccent($word);

$s = xmlsentences($textstring);
$s = xmlsentences({st=>"frase"},$textstring);
@s = sentences($textstring);

%o = oco("infile1", "infile2");
     oco({num=>1,output=>"file"}, "infile1", "infile2");
%o = oco({from=>"string"},"string1");

perl -MLingua::PT::PLN -e 'cqptokens("file")' > out

DESCRIPTION

oco([options,], file*)

Option num=1 means the output is sorted by number of occurrences.

Option alpha=1 means the output is sorted lexicographically.

Option output=f means the output is written to file f.

Option from="string" means the input is a string instead of a file.

oco({num=>1,output=>"f"}, f1,f2,...)
oco({alpha=>1,output=>"f"}, f1,f2,...)
%oc=oco( f1,f2,...)
%oc=oco( {from=>"string"},"text in a string")
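
The two orderings can be illustrated in plain Perl without the module. This is only a sketch: the helper name count_words and its tokenizer are hypothetical and are not the module's implementation, and the descending order used for the num analogue is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: count word occurrences in a string.
# The tokenizer (runs of letters, lowercased) is an assumption,
# not what Lingua::PT::PLN does internally.
sub count_words {
    my ($text) = @_;
    my %count;
    $count{ lc $1 }++ while $text =~ /(\p{L}+)/g;
    return %count;
}

my %oc = count_words("o gato viu o rato e o rato fugiu");

# Analogue of num => 1: most frequent first (assumed), ties alphabetical.
my @by_num = sort { $oc{$b} <=> $oc{$a} or $a cmp $b } keys %oc;

# Analogue of alpha => 1: lexicographic order of the words.
my @by_alpha = sort keys %oc;

print "$_\t$oc{$_}\n" for @by_num;
```

With the module installed, the corresponding call would be oco({from=>"string"}, $text), as listed above.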

forPN( $funref )

Replaces every proper name read from STDIN with funref(propername) and sends the result to STDOUT.

Optionally, you can pass {t => "full"} as the first parameter to also obtain names that appear after a ".".

forPN({in => inputfile(stdin), out => file(stdout)}, sub{...})
forPN({sep=>"\n", t=>"normal"}, sub{...})
forPN({sep=>'', t=>"double"}, sub{...}, sub{...})

forPNstring( $funref, "textstring" [, regSeparator] )

Replaces every proper name in the text string with funref(propername).
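
The callback-substitution pattern can be sketched in plain Perl. The function for_names_in_string below is a hypothetical stand-in, and its regular expression (runs of capitalized ASCII words) is a crude assumption. The module's own proper-name detection is not documented here and is likely more elaborate.

```perl
use strict;
use warnings;

# Hypothetical stand-in for forPNstring: applies $f to every run of
# capitalized words. The regex is an assumption for illustration only;
# it ignores accented letters and sentence-initial words.
sub for_names_in_string {
    my ($f, $text) = @_;
    $text =~ s/\b([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)\b/$f->($1)/ge;
    return $text;
}

my $out = for_names_in_string(
    sub { "<pn>$_[0]</pn>" },
    "ontem a Maria Silva visitou Braga",
);
print "$out\n";
```

Each matched name is passed to the callback, and the callback's return value takes its place in the text.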

printPNstring(options)

printPN("oco")

printPNstring("oco")

syllabe( $phrase )

Returns the phrase with its syllables separated by "|".

accent( $phrase )

Returns the phrase with its syllables separated by "|" and accents marked with the character ".

cqptokens()

cqptokens - encodes a text from STDIN for CQP (one token per line)

sentences()

sentences - ....

xmlsentences()

xmlsentences - ....

By default, sentences are marked with the tag "s". To change this, use the optional st parameter. Example:

xmlsentences({st=> "tag"}, text) 

to mark sentences with tag "tag".
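
The effect of the st parameter can be sketched in plain Perl. The function wrap_sentences is a hypothetical stand-in: both its sentence splitter and its output layout (one <tag>...</tag> element per line) are assumptions, since the module's exact output format is not described here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for xmlsentences. Splits after ".", "!" or "?"
# followed by whitespace, then wraps each piece in <tag>...</tag>.
# The tag defaults to "s" and can be overridden with the st option.
sub wrap_sentences {
    my ($opt, $text) = @_;
    my $tag = $opt->{st} // "s";
    my @sentences = split /(?<=[.!?])\s+/, $text;
    return join "\n", map { "<$tag>$_</$tag>" } @sentences;
}

my $text = "Choveu muito. Ficamos em casa!";
print wrap_sentences({}, $text), "\n";
print wrap_sentences({ st => "frase" }, $text), "\n";
```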

AUTHOR

José João Almeida (jj@di.uminho.pt)

Alberto Simões (albie@alfarrabio.di.uminho.pt)

Paulo Rocha (paulo.rocha@di.uminho.pt)

Thanks to

Diana Santos

SEE ALSO

perl(1).

cqp(1).
