NAME
Lingua::PT::ProperNames - Simple module to extract proper names from Portuguese Text
Version
Version 0.08
Synopsis
This module contains simple Perl-based functions to detect and extract proper names from Portuguese text.
use Lingua::PT::ProperNames;
printPN(@options);
printPNstring({ %options... } ,$textstring);
printPNstring([ @options... ] ,$textstring);
forPN( sub{my ($pn, $context)=@_;... } ) ;
forPN( {t=>"double"},
sub{my ($pn, $context)=@_;... }, sub{...} ) ;
$outstr = forPN($instr, sub{my ($pn, $context)=@_;... }, ... ) ;
forPNstring(sub{my ($pn, $context)=@_;... },
$textstring, regsep) ;
my $pndict = Lingua::PT::ProperNames->new;
ProperNames dictionary
new
Creates a new ProperNames dictionary
is_name
This method checks if a name exists in the Names dictionary.
is_surname
This method checks if a name exists in the Names dictionary as a surname.
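A minimal sketch of how the two lookup methods might be combined. The helper name classify_word is hypothetical, and the sketch assumes that is_name and is_surname each take the word as their only argument and return a true or false value, which the text above suggests but does not spell out.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a word using any object that offers
# is_name and is_surname, such as the one returned by
# Lingua::PT::ProperNames->new.
# Assumption: both methods take the word and return true/false.
sub classify_word {
    my ($dict, $word) = @_;
    return 'surname' if $dict->is_surname($word);   # checked first: the more specific test
    return 'name'    if $dict->is_name($word);
    return 'unknown';
}

# With the module installed, usage would look like:
#   use Lingua::PT::ProperNames;
#   my $dict = Lingua::PT::ProperNames->new;
#   print classify_word($dict, 'Silva'), "\n";
```

The surname test comes first because, as described above, surnames are a subset of the entries in the Names dictionary.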
Export the following functions
forPN
Substitutes every proper name by the result of funref->($propername,$context), reading from STDIN and sending the output to STDOUT.
Usage:
forPN({options...}, sub{ propername processor...})
Optionally you can define input or output files:
forPN({in=> "inputfile", out => "outputfile" }, sub{...})
Optionally you can use the option type {t => "double"} to apply special treatment to names that follow punctuation (".", etc.). With this option you must provide two functions: one for normal proper names and one for names after punctuation.
forPN({t=>"double"}, sub{...}, sub{...})
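The idea behind the two callbacks can be sketched in plain Perl. This is not the module's implementation: replace_double and the pattern $pn are hypothetical stand-ins, and the detector here is just "a run of capitalized words". A capitalized word at the start of the text or right after ".", "!" or "?" is ambiguous (it may simply open a sentence), so it goes to the second callback.

```perl
use strict;
use warnings;

# Hypothetical, simplified detector: a run of capitalized words.
my $pn = qr/\p{Lu}\p{Ll}+(?:\s+\p{Lu}\p{Ll}+)*/;

# Call $normal for names inside a sentence and $after_punct for
# capitalized words at the start of the text or after ".", "!" or "?".
sub replace_double {
    my ($text, $normal, $after_punct) = @_;
    $text =~ s{(^\s*|[.!?]\s+)?($pn)}{
        defined $1 ? $1 . $after_punct->($2) : $normal->($2)
    }ge;
    return $text;
}

my $out = replace_double(
    "Ontem a Maria Silva chegou. Depois o Rui saiu.",
    sub { '[' . $_[0] . ']' },     # normal proper names
    sub { '(' . $_[0] . '?)' },    # capitalized words after punctuation
);
print "$out\n";
# prints: (Ontem?) a [Maria Silva] chegou. (Depois?) o [Rui] saiu.
```

Keeping the ambiguous cases in a separate callback lets the caller decide whether to accept, reject, or just flag them.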
You can also define the record (paragraph) separator:
forPN({sep=>"\n", t=>"normal"}, sub{...}) ## each line is a paragraph
forPN({sep=>""}, sub{...}) ## paragraphs are separated by empty lines
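The two separator values shown above match the conventions of Perl's input record separator $/. Assuming sep is interpreted the same way (the text does not confirm this), the following sketch shows how the same input splits into records under each setting. The helper read_records is hypothetical and only illustrates $/.

```perl
use strict;
use warnings;

# Split $text into records the way Perl's input record separator $/ does.
# Assumption: forPN's "sep" option follows the same conventions as $/.
sub read_records {
    my ($text, $sep) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    local $/ = $sep;
    my @records = <$fh>;
    close $fh;
    return @records;
}

my $text  = "O Rui saiu.\nA Ana ficou.\n\nBraga fica no Minho.\n";
my @lines = read_records($text, "\n");   # one record per line
my @pars  = read_records($text, "");     # paragraph mode: records end at empty lines
printf "%d lines, %d paragraphs\n", scalar @lines, scalar @pars;
# prints: 4 lines, 2 paragraphs
```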
forPNstring
forPNstring( $funref, "textstring" [, regSeparator] )
Substitutes every proper name in the text string by funref(propername).
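A rough sketch of callback-based substitution over a string, to show the shape of the operation. It does not use the module: replace_names and the pattern $pn are hypothetical, and the pattern is a deliberately simplified detector (capitalized words, optionally linked by "de", "da", "do", "dos", "das"), not the module's actual rules.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a proper-name detector: capitalized words,
# optionally joined by a Portuguese connector.
my $word = qr/\p{Lu}\p{Ll}+/;
my $pn   = qr/$word(?:\s+(?:(?:de|da|do|dos|das)\s+)?$word)*/;

# Replace every match in $text by the value returned from $callback.
sub replace_names {
    my ($callback, $text) = @_;
    $text =~ s/($pn)/$callback->($1)/ge;
    return $text;
}

my $out = replace_names(
    sub { "<pn>$_[0]</pn>" },
    "ontem o Pierre de Fermat visitou Braga.",
);
print "$out\n";
# prints: ontem o <pn>Pierre de Fermat</pn> visitou <pn>Braga</pn>.
```

The callback receives the matched name and returns its replacement, so the same routine can tag, normalize, or delete names depending on the function passed in.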
printPNstring
printPNstring("oco")
getPN
printPN
printPN("oco")
printPN - extracts the proper names from a text.
-comp joins certain names: Fermat + Pierre de Fermat = (Pierre de) Fermat
-prof
-e "Sebastiao e Silva": "e" is treated as part of the proper name
-em "em Famalicão" is treated as part of the proper name
Author
José João Almeida, <jj@di.uminho.pt>
Alberto Simões, <ambs@di.uminho.pt>
Bugs
NOTE: We know that documentation for the exported methods is nonexistent. We are working on it and expect to add it very soon.
Please report any bugs or feature requests to bug-lingua-pt-propernames@rt.cpan.org, or through the web interface at http://rt.cpan.org. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
COPYRIGHT & LICENSE
Copyright 2004 Alberto Simões, All Rights Reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.