NAME

Lingua::Stem::Es - Perl Spanish Stemming

SYNOPSIS

use Lingua::Stem::Es;

my $stems = Lingua::Stem::Es::stem({ -words => $word_list_reference,
                                     -locale => 'es',
                                     -exceptions => $exceptions_hash,
                                  });

my $stem = Lingua::Stem::Es::stem_word( $word );

DESCRIPTION

This module uses Porter's Stemming Algorithm to return an array reference of stemmed words.

The algorithm is implemented as described in:

http://snowball.tartarus.org/algorithms/spanish/stemmer.html

The interface follows the conventions set by the Lingua::Stem module by Benjamin Franz. This Spanish version is based on the French version by Sébastien Darribere-Pleyt.

METHODS

stem({ -words => \@words, -locale => 'es', -exceptions => \%exceptions });

Stems a list of passed words and returns an array reference to the stemmed words. Note that -locale is not necessary: this module does not use it, and it defaults to 'es' anyway. The keys of '\%exceptions' are words that should not be processed; for those words, the corresponding hash values are returned in the resulting array reference.

Example:

my $stemmed_words = Lingua::Stem::Es::stem({ 
    -words => \@words,
    -locale => 'es',
    -exceptions => \%exceptions,
});
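
The handling of -exceptions described above can be illustrated with a minimal sketch. The word list and the exception table are hypothetical, chosen only for illustration, and the loop assumes that the returned array has one entry per input word, in the same order:

```perl
use strict;
use warnings;
use Lingua::Stem::Es;

# Hypothetical input and exception table, for illustration only.
my @words      = qw(casas corriendo gracias);
my %exceptions = ( gracias => 'gracias' );   # key is skipped, value is returned

# -locale is omitted because it defaults to 'es'.
my $stems = Lingua::Stem::Es::stem({
    -words      => \@words,
    -exceptions => \%exceptions,
});

# Assumption: results line up with the input words by position.
for my $i (0 .. $#words) {
    print "$words[$i] => $stems->[$i]\n";
}
```

Mapping an exception key to itself, as here, is a simple way to protect a word from stemming.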

stem_word( $word );

Stems a single word and returns the stem directly.

Example:

my $stem = Lingua::Stem::Es::stem_word( $word );

stem_caching({ -level => 0|1|2 });

Sets the level of stem caching.

'0' means 'no caching'. This is the default level.

'1' means 'cache per run'. This caches stemming results during a single call to 'stem'.

'2' means 'cache indefinitely'. This caches stemming results until either the process exits or the 'clear_stem_cache' method is called.

clear_stem_cache;

Clears the cache of stemmed words.
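
The caching controls might be combined as in the following sketch. It assumes that stem_caching and clear_stem_cache are called as plain package functions, in the same way as stem; the word list is hypothetical:

```perl
use strict;
use warnings;
use Lingua::Stem::Es;

# Assumption: called as package functions, like stem().
# Level 2 keeps cached stems across calls to stem().
Lingua::Stem::Es::stem_caching({ -level => 2 });

# Hypothetical input with repeated words, where caching may help.
my @batch = qw(casas casas corriendo casas);
my $stems = Lingua::Stem::Es::stem({ -words => \@batch });

# Release the cached stems and return to the default level.
Lingua::Stem::Es::clear_stem_cache();
Lingua::Stem::Es::stem_caching({ -level => 0 });
```

Since a level 2 cache persists until the process exits or the cache is cleared, clearing it explicitly is worth considering in long-running processes.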

SEE ALSO

You can see Mr Porter's Spanish stemming algorithm here:

http://snowball.tartarus.org/algorithms/spanish/stemmer.html

I took from his site the voc.txt and output.txt files that are included in this distribution, for testing. Those two files were released under the BSD License: http://snowball.tartarus.org/license.php and are therefore bound to it.

AUTHOR

Julio Fraire, <julio.fraire@gmail.com>

COPYRIGHT AND LICENSE

Copyright (c) 2001, Dr Martin Porter http://snowball.tartarus.org/

Copyright (C) 2004 by Sébastien Darribere-Pleyt <sebastien.darribere@lefute.com>

Copyright (C) 2008 by Julio Fraire, <julio.fraire@gmail.com>

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.
