NAME
SuffixTree - Efficient string manipulation data structure interface for Perl.
SYNOPSIS
use SuffixTree;
my $str = "mississippi";
my $tree=create_tree($str);
print_tree($tree);
my $position = find_substring($tree, "ssis");
printf("\nPosition of ssis in mississippi is %ld.\n\n", $position);
delete_tree($tree); # NOTICE: this method will soon become deprecated
DEPRECATED SYNOPSIS
use SuffixTree;
my $str = "mississippi";
my $tree=ST_CreateTree($str, length($str));
ST_PrintTree($tree);
my $position = ST_FindSubstring($tree, "ssis", 4);
printf("\nPosition of ssis in mississippi is %ld.\n\n", $position);
ST_DeleteTree($tree);
DESCRIPTION
The intention of this project is to provide an open-source implementation for an efficient data structure for strings manipulation - the Suffix Tree.
The code was written with as much consistency with the theoretical algorithm as possible (see references). It provides a set of interface functions for creating and searching the tree.
The suffix tree is implemented in ANSI-C. The code is written based on the original algorithm by E.Ukkonen. This is the Perl interface for the underlying ANSI-C implementation.
FUNCTIONS
All functions are exported by default. Please note that all these interface functions were automatically extracted from the ANSI-C header file, so they might not behave as Perlish as you'd expect them to. This is something we will definitly address in the future.
- $tree = create_tree($string)
-
Allocates memory for the tree and starts Ukkonen's construction algorithm. Parameters: A string. Returns a reference to the tree.
- $position = find_substring($tree, $substring)
-
Searches for a string in the tree. It traverses the tree down starting its root like in a regular trie. Parameters: the tree to search in, a substring to look for. Returns the 1-based position it was found in the source string or 0 if string is not in the tree.
- print_tree($tree)
-
Prints the tree. Parameters: the tree to print.
- delete_tree($tree)
-
Deletes a suffix tree. Parameters: the tree to delete. Returns : void.
DEPRECATED FUNCTIONS
All functions are exported by default. Please note that all these interface functions were automatically extracted from the ANSI-C header file, so they might not behave as Perlish as you'd expect them to. This is something we will try to address in the future.
- $tree = ST_CreateTree($string, length($string))
-
Allocates memory for the tree and starts Ukkonen's construction algorithm. Parameters: A string, length of the string. Returns a reference to the tree.
- $position = ST_FindSubstring($tree, $substring, length($substring))
-
Searches for a string in the tree. It traverses the tree down starting its root like in a regular trie. Parameters: the tree to search in, a substring to look for, length of substring. Returns the 1-based position it was found in the source string or 0 if string is not in the tree.
- ST_PrintTree($tree)
-
Prints the tree. Parameters: the tree to print.
- ST_DeleteTree($tree)
-
Deletes a suffix tree. Parameters: the tree to delete. Returns : void.
BUGS
This Perl interface was mostly built automatically (using SWIG). Little to no attention was given to testing. In future relases of this Perl Module (along with its underlying ANSI-C implementation) we hope to fix all problems that might currenly interfere with successful usage of this module. Please send bug reports to the current maintainer(s) of this module.
FUTURE WORK
[1] A better Perl-ish interface
[2] Building tests for this module (for the `make test` part of the installation)
[3] Object Oriented like usage
PORTABILITY
Please read the README file for information.
SEE ALSO
L<https://github.com/daoswald/SuffixTree.git> - This module's Github
repository.
L<http://en.wikipedia.org/wiki/Suffix_tree> - Wikipedia's Suffix Tree
explanation.
AUTHOR
Shlomo Yona <yona@cs.technion.ac.il> is the original author.
David Oswald <davido@cpan.org> is the current maintainer.
THANKS TO
Offer Kaye for useful ideas and support.
COPYRIGHT
Copyright (c) 2002, 2003, 2012 Shlomo Yona. All rights reserved.
This library is free software. You can redistribute it and/or modify it under the same terms as Perl itself.