NAME
Map::Tube::Plugin::FuzzyFind - Map::Tube add-on for finding stations and lines by inexact name.
VERSION
Version 0.81.0
DESCRIPTION
This is an add-on for Map::Tube to find stations and lines by name, possibly partly or inexactly specified. The module is a Moo role which gets plugged into the Map::Tube::* family automatically once it is installed.
SYNOPSIS
use strict; use warnings;
use Map::Tube::London;
my $tube = Map::Tube::London->new();
print 'line matches exactly: ',
      scalar( $tube->fuzzy_find( search  => 'erloo',
                                 objects => 'lines' ) ), "\n";
print 'line contains : ',
      scalar( $tube->fuzzy_find( search  => 'erloo',
                                 objects => 'lines',
                                 method  => 'in' ) ), "\n";
print 'same thing : ',
      scalar( $tube->fuzzy_find( 'erloo',
                                 objects => 'lines',
                                 method  => 'in' ) ), "\n";
print 'same thing : ',
      scalar( $tube->fuzzy_find( { search  => 'erloo',
                                   objects => 'lines',
                                   method  => 'in' } ) ), "\n";
print 'station re : ',
      join( ' ', $tube->fuzzy_find( search  => qr/[htrv]er/i,
                                    objects => 'stations' ) ), "\n";
print 'station re : ',
      join( ' ', $tube->fuzzy_find( search  => '[htrv]er',
                                    objects => 'stations',
                                    method  => 'regex' ) ), "\n";
print 'line fuzzy : ',
      scalar( $tube->fuzzy_find( search  => 'Kyrkle',
                                 objects => 'lines',
                                 method  => 'levenshtein' ) ), "\n";
METHODS
fuzzy_find(%args)
Find a tube line or station by some pattern, which may be partly or inexactly specified. In array context, a (possibly empty) array of the Map::Tube::{Line,Node} objects that match the pattern is returned. If the matching method employed provides a measure of similarity, the result set is ordered by decreasing similarity. Otherwise, it is ordered alphabetically. In scalar context, a single Map::Tube::{Line,Node} object (or undef, if nothing matches) is returned. If there is more than one match, the most similar or the alphabetically first match is returned, as applicable to the matching method.
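The context-dependent return value described above can be illustrated with a minimal, self-contained sketch. It does not use the module itself: pick_matches is a hypothetical helper, and the names and similarity scores are made up for illustration. It only shows the general Perl technique (wantarray) that produces this kind of behaviour, not how the module is actually implemented.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Map::Tube::Plugin::FuzzyFind):
# takes [ name, similarity ] pairs and returns names the way the
# documentation describes fuzzy_find's return value.
sub pick_matches {
    my (@scored) = @_;

    # Order by decreasing similarity; break ties alphabetically.
    my @sorted = map  { $_->[0] }
                 sort { $b->[1] <=> $a->[1] or $a->[0] cmp $b->[0] } @scored;

    # Array context: all matches. Scalar context: best match, or undef.
    return wantarray ? @sorted : $sorted[0];
}

# Made-up names and scores, for illustration only.
my @all  = pick_matches( [ 'Central', 0.4 ], [ 'Circle', 0.9 ] );
my $best = pick_matches( [ 'Central', 0.4 ], [ 'Circle', 0.9 ] );
my $none = pick_matches();

print "all:  @all\n";
print "best: $best\n";
print 'none: ', ( defined $none ? $none : 'undef' ), "\n";
```

When calling the real method, wrapping the call in scalar( ... ), as the SYNOPSIS does, forces scalar context inside an argument list such as that of print.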
%args
is a hash of optional named parameters to guide actions. It may be specified as a hash or as a reference to a hash. For convenience, the search pattern may be specified as the first argument, outside the hash. While formally all arguments are optional, not specifying a search pattern will, predictably, not produce any exciting result.
- search=...
-
The pattern to be searched for. It may be a string or a (possibly precompiled) regular expression. The latter requires the matching method (cf. below) to be 'regex'.
- objects=...
-
This specifies whether stations or lines should be found. The value should be either 'lines' or 'stations'. ('nodes' is a synonym for 'stations'.) If it is none of these, then both lines and stations will be searched.
- method=...
-
The method for matching. If this parameter is missing, the default is to use 'regex' if the pattern is a precompiled regex, or 'exact' otherwise. If it is given, the value should be one of the following.
- 'exact'
-
Exact matching will be performed (up to case). This is also the default method if only a search string (without any further arguments) is supplied.
- 'start'
-
The given string pattern must match at the beginning of the line or station name (up to case).
- 'in'
-
The given string pattern must match somewhere within the line or station name (up to case).
- 're' or 'regex'
-
The given pattern is matched as a regex (case-insensitively) against the line or station names. The pattern may also be specified as a precompiled regex. In this case, its case sensitivity will be used unaltered.
- 'soundex'
-
All names matching according to the soundex algorithm (see Text::Soundex) will be returned. Actually, a variant of Donald E. Knuth's original algorithm is used which also tries to cope with non-ASCII characters. It works well only for English-like words.
- 'phonix'
-
All names matching according to the phonix algorithm (see Text::Phonetic::Phonix) will be returned. This is an alternative to soundex.
- 'daitchmokotoff' or 'dmsoundex'
-
All names matching according to the Daitch-Mokotoff algorithm (see Text::Phonetic::DaitchMokotoff) will be returned. This alternative to soundex may be preferable for Slavic and Yiddish names.
- 'koeln'
-
All names matching according to the Koeln phonetic encoding (see Text::Phonetic::Koeln) will be returned. This alternative to soundex may work better for German names, as well as for longer names.
- 'metaphone'
-
All names matching according to the Metaphone algorithm (see Text::Metaphone) will be returned. This is a method that strives to be "a modern version of soundex". It is also tuned towards English words.
- 'doublemetaphone'
-
All names matching according to the DoubleMetaphone algorithm (see Text::DoubleMetaphone) will be returned. This was developed as an improvement on Metaphone.
- 'levenshtein' or 'editdistance'
-
The closest names as calculated by the Levenshtein edit distance (see Text::Levenshtein) will be returned.
- 'levenshteindamerau' or 'damerau'
-
The closest names as calculated by the Levenshtein-Damerau edit distance (see Text::Levenshtein::Damerau) will be returned. This is an alternative to the edit distance defined by Levenshtein.
- 'jarowinkler'
-
The closest names as calculated by the Jaro-Winkler edit distance (see Text::JaroWinkler) will be returned. This is an alternative to the edit distance defined by Levenshtein.
- 'ngram'
-
The closest names as calculated by a comparison of trigrams (see String::Trigram) will be returned. (Future versions may include n-grams for n other than 3.)
- 'fuzzy'
-
Currently, this is a synonym for 'levenshtein'. This may change in the future.
- a code ref
-
Reserved for future use.
- ...
-
Further fuzzy matchers may be added in the future according to interest.
- maxdist=...
-
For some matchers that define a metric on strings (like Levenshtein), this specifies the maximum allowable distance from the given pattern. The default is half the length of the pattern. If 0 is specified, no limit will be applied. Note that, in array context, this may result in a large number of returned values. In scalar context, a nonzero value (including the default value) may lead to no result being returned.
- maxsize=...
-
In array context, this may be used to specify the maximum number of values to return, in order to prevent flooding. There is no default. If 0 is specified, no limit will be applied. In scalar context, this parameter is disregarded.
- maxcodelen=...
-
(Used only for Metaphone.) The original definition of the Metaphone algorithm uses a maximum of 4 characters for the encoded strings (just like Soundex). 4 is also used here by default. This parameter allows other values to be set.
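To make the interplay of the parameters above more concrete, here is a rough, self-contained sketch that works on plain name strings instead of Map::Tube objects. fuzzy_names and edit_distance are hypothetical stand-ins written for this illustration; they are not the module's code, cover only a few of the methods ('exact', 'start', 'in', 'regex', 'levenshtein'), and simply follow the documented rules for the calling conventions, the default method, maxdist and maxsize.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Plain Levenshtein edit distance, compared case-insensitively.
sub edit_distance {
    my ( $s, $t ) = map { lc $_ } @_;
    my @prev = ( 0 .. length($t) );
    for my $i ( 1 .. length($s) ) {
        my @cur = ($i);
        for my $j ( 1 .. length($t) ) {
            my $cost = substr( $s, $i - 1, 1 ) eq substr( $t, $j - 1, 1 ) ? 0 : 1;
            push @cur, min( $prev[$j] + 1, $cur[ $j - 1 ] + 1, $prev[ $j - 1 ] + $cost );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Hypothetical stand-in for fuzzy_find (NOT the module's implementation).
sub fuzzy_names {
    my $names = shift;    # array ref of candidate names

    # Accept a hash ref, a plain hash, or a leading search string plus a hash.
    my %args = ref( $_[0] ) eq 'HASH' ? %{ $_[0] }
             : @_ % 2                 ? ( search => @_ )
             :                          @_;
    my $search = $args{search};
    return unless defined $search;

    # Default method: 'regex' for a precompiled regex, 'exact' otherwise.
    my $method = $args{method} // ( ref($search) eq 'Regexp' ? 'regex' : 'exact' );

    my @hits;
    if ( $method eq 'levenshtein' ) {
        # Default limit: half the pattern length; 0 means no limit.
        my $maxdist = $args{maxdist} // length($search) / 2;
        my %dist = map { $_ => edit_distance( $search, $_ ) } @$names;
        @hits = sort { $dist{$a} <=> $dist{$b} or $a cmp $b }
                grep { !$maxdist or $dist{$_} <= $maxdist } @$names;
    }
    else {
        # A precompiled regex keeps its own case sensitivity.
        # In this sketch, any unrecognised method is treated as a regex.
        my $re = ref($search) eq 'Regexp' ? $search
               : $method eq 'exact'       ? qr/\A\Q$search\E\z/i
               : $method eq 'start'       ? qr/\A\Q$search\E/i
               : $method eq 'in'          ? qr/\Q$search\E/i
               :                            qr/$search/i;
        @hits = sort { $a cmp $b } grep { $_ =~ $re } @$names;
    }

    return $hits[0] unless wantarray;    # scalar context disregards maxsize
    my $maxsize = $args{maxsize} // 0;
    splice @hits, $maxsize if $maxsize and @hits > $maxsize;
    return @hits;
}

my @lines = ( 'Bakerloo', 'Central', 'Circle', 'Victoria', 'Waterloo & City' );

my @in    = fuzzy_names( \@lines, 'erloo', method => 'in' );
my $fuzzy = fuzzy_names( \@lines, { search => 'Kyrkle', method => 'levenshtein' } );
my @re    = fuzzy_names( \@lines, search => qr/^c/i, maxsize => 1 );

print "in:    @in\n";
print "fuzzy: $fuzzy\n";
print "regex: @re\n";
```

In this sketch, 'Kyrkle' has length 6, so the default maxdist is 3, which is just enough to reach 'Circle' (three substitutions). A tighter limit such as maxdist => 2 would leave the scalar-context call without a result, which is the situation the maxdist description warns about.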
AUTHOR
Gisbert W. Selke, TapirSoft Selke & Selke GbR, <gws at cpan.org>
SEE ALSO
Map::Tube and the various Text::* modules referenced above
CONTRIBUTORS
Thanks to Mohammad S Anwar, author of Map::Tube, for that module, for great feedback, discussions, advice, debugging help, and willingness to refactor his code. Thanks to Slaven Rezic for extensive testing and valuable suggestions.
BUGS
Please report any bugs or feature requests to bug-map-tube-plugin-fuzzyfind at rt.cpan.org, or through the web interface at https://rt.cpan.org/NoAuth/ReportBug.html?Queue=Map-Tube-Plugin-FuzzyFind. I will be notified and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Map::Tube::Plugin::FuzzyFind
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Map-Tube-Plugin-FuzzyFind
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
LICENSE AND COPYRIGHT
Copyright (C) 2015, 2024 Gisbert W. Selke, Tapirsoft Selke & Selke GbR
This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at:
http://www.perlfoundation.org/artistic_license_2_0
Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license.
If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license.
This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder.
This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed.
Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.