NAME
Tree::Lexicon - Object class for storing and retrieving a lexicon in a tree of affixes
VERSION
Version 0.01
SYNOPSIS
use Tree::Lexicon;
my $lexicon = Tree::Lexicon->new();
$lexicon->insert( 'apply', '', 'Apple', 'Windows', 'Linux', 'app', 'all day' );
# Warns of strings not matching /^\w+/ without inserting
if ($lexicon->contains( 'WiNdOwS' )) {
$lexicon->remove( 'wInDoWs' );
$lexicon->insert( 'Vista' );
}
my @words = $lexicon->vocabulary;
# Same as:
@words = ( 'Apple', 'Linux', 'Windows', 'app', 'apply' );
@words = $lexicon->auto_complete( 'ap' );
# Same as:
@words = ( 'app', 'apply' );
my $regexp = $lexicon->as_regexp();
# Same as:
$regexp = qr/\b(?:Apple|Linux|Windows|app(?:ly)?)\b/;
my $caseless = Tree::Lexicon->new( 0 )->insert( 'apply', '', 'Apple', 'Windows', 'Linux', 'app', 'all day' );
# Warns of strings not matching /^\w+/ without inserting
if ($caseless->contains( 'WiNdOwS' )) {
$caseless->remove( 'wInDoWs' );
$caseless->insert( 'Vista' );
}
@words = $caseless->vocabulary;
# Same as:
@words = ( 'APP', 'APPLE', 'APPLY', 'LINUX', 'VISTA' );
@words = $caseless->auto_complete( 'ap' );
# Same as:
@words = ( 'APP', 'APPLE', 'APPLY' );
$regexp = $caseless->as_regexp();
# Same as:
$regexp = qr/\b(?:[Aa][Pp][Pp](?:[Ll](?:[Ee]|[Yy]))?|[Ll][Ii][Nn][Uu][Xx]|[Vv][Ii][Ss][Tt][Aa])\b/;
use Tree::Lexicon qw( cs_regexp ci_regexp );
my $cs_regexp = cs_regexp( @words );
# Same as:
$cs_regexp = Tree::Lexicon->new()->insert( @words )->as_regexp();
my $ci_regexp = ci_regexp( @words );
# Same as:
$ci_regexp = Tree::Lexicon->new( 0 )->insert( @words )->as_regexp();
DESCRIPTION
The purpose of this module is to provide a simple and effective means to store a lexicon. It is intended to aid parsers in identifying keywords and interactive applications in identifying user-provided words.
EXPORT
cs_regexp
Convenience function for generating a case-sensitive regular expression from a list of words.
my $cs_regexp = cs_regexp( @words );
# Same as:
$cs_regexp = Tree::Lexicon->new( 1 )->insert( @words )->as_regexp();
ci_regexp
Convenience function for generating a case-insensitive regular expression from a list of words.
my $ci_regexp = ci_regexp( @words );
# Same as:
$ci_regexp = Tree::Lexicon->new( 0 )->insert( @words )->as_regexp();
METHODS
Passing a string not matching /^\w+/ as an argument to insert, remove, contains or auto_complete yields a warning to STDERR and nothing else.
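The check described above can be pictured with a short sketch. The helper name is_acceptable and the wording of the warning are assumptions made for illustration; they are not part of Tree::Lexicon, and the module's own check may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Tree::Lexicon: applies the documented
# /^\w+/ test and warns on STDERR when a string fails it.
sub is_acceptable {
    my ($string) = @_;
    return 1 if defined $string && $string =~ /^\w+/;
    warn "Ignoring string not matching /^\\w+/\n";
    return 0;
}

# The empty string and '!bang' fail the test and are skipped with a warning.
my @accepted = grep { is_acceptable($_) } ( 'apply', '', 'Apple', '!bang' );
print "@accepted\n";    # prints "apply Apple"
```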
new
Returns a new empty Tree::Lexicon object. By default, the tree's contents are case-sensitive. Passing a single false argument to the constructor makes its contents case-insensitive.
$lexicon = Tree::Lexicon->new();
# Same as:
$lexicon = Tree::Lexicon->new( 1 );
# Or, for a case-insensitive lexicon:
$lexicon = Tree::Lexicon->new( 0 );
insert
Inserts zero or more words into the lexicon tree and returns the object.
$lexicon->insert( 'list', 'of', 'words' );
If you already have an initial list of words, you can chain this method onto the constructor.
my $lexicon = Tree::Lexicon->new()->insert( @words );
remove
Removes zero or more words from the lexicon tree and returns them (or undef if not found).
@removed = $lexicon->remove( 'these', 'words' );
contains
Returns 1 or '' for each word, indicating its presence or absence, respectively.
@verify = $lexicon->contains( 'these', 'words' );
auto_complete
Returns all words beginning with the string passed.
@words = $lexicon->auto_complete( 'a' );
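To illustrate how prefix lookup can work in a tree of this kind, here is a minimal sketch built on nested hashes. The names %root, trie_insert, trie_words and trie_complete are hypothetical, and Tree::Lexicon's internal representation may well differ; the sketch only reproduces the case-sensitive behaviour shown in the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical hash-based prefix tree; Tree::Lexicon's internals may differ.
my %root;

# Store a word one character per level; an empty-string key marks a word's end.
sub trie_insert {
    my ($word) = @_;
    my $node = \%root;
    for my $char ( split //, $word ) {
        $node->{$char} //= {};
        $node = $node->{$char};
    }
    $node->{''} = 1;
    return;
}

# Collect every word stored at or below $node, in sorted order.
sub trie_words {
    my ( $node, $prefix ) = @_;
    my @found;
    for my $key ( sort keys %$node ) {
        if ( $key eq '' ) {
            push @found, $prefix;
        }
        else {
            push @found, trie_words( $node->{$key}, $prefix . $key );
        }
    }
    return @found;
}

# Walk down to the node for $prefix, then list everything beneath it.
sub trie_complete {
    my ($prefix) = @_;
    my $node = \%root;
    for my $char ( split //, $prefix ) {
        $node = $node->{$char} or return;
    }
    return trie_words( $node, $prefix );
}

trie_insert($_) for qw( apply Apple Windows Linux app );
my @completions = trie_complete('ap');
print "@completions\n";    # prints "app apply"
```

Calling trie_complete('') in this sketch walks the whole tree, which is one way a method like vocabulary could be expressed in terms of the same traversal.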
vocabulary
Returns all words in the lexicon.
@words = $lexicon->vocabulary();
as_regexp
Returns a regular expression equivalent to the lexicon tree. The regular expression has the form qr/\b(?: ... )\b/.
$regexp = $lexicon->as_regexp();
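As a sketch of how such a regular expression might be used to pick keywords out of text, the example below takes the case-sensitive pattern shown in the SYNOPSIS as a literal, standing in for the value returned by as_regexp. The sample sentence is invented for illustration.

```perl
use strict;
use warnings;

# Pattern copied from the SYNOPSIS; it stands in for $lexicon->as_regexp().
my $regexp = qr/\b(?:Apple|Linux|Windows|app(?:ly)?)\b/;

# Invented sample text. 'linux' differs in case and 'Applesauce' has no
# word boundary after 'Apple', so neither matches.
my $text  = 'Please apply the app update on Linux, not on linux or Applesauce.';
my @found = $text =~ /($regexp)/g;
print "@found\n";    # prints "apply app Linux"
```

The \b anchors on both sides are what keep the pattern from matching inside longer words.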
AUTHOR
S. Randall Sawyer, <srandalls at cpan.org>
BUGS
Please report any bugs or feature requests to bug-tree-lexicon at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Tree-Lexicon. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Tree::Lexicon
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
ACKNOWLEDGMENTS
This module's framework was generated with module-starter.
LICENSE AND COPYRIGHT
Copyright 2013 S. Randall Sawyer.
This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at: