NAME

SuffixTree - Efficient string manipulation data structure interface for Perl.

SYNOPSIS

use SuffixTree;

my $str = "mississippi";
my $tree=create_tree($str);
print_tree($tree);
my $position = find_substring($tree, "ssis");

printf("\nPosition of ssis in mississippi is %ld.\n\n", $position);

delete_tree($tree); # NOTICE: this method will soon become deprecated 

DEPRECATED SYNOPSIS

use SuffixTree;

my $str = "mississippi";
my $tree=ST_CreateTree($str, length($str));
ST_PrintTree($tree);
my $position = ST_FindSubstring($tree, "ssis", 4);

printf("\nPosition of ssis in mississippi is %ld.\n\n", $position);

ST_DeleteTree($tree);

DESCRIPTION

The intention of this project is to provide an open-source implementation for an efficient data structure for strings manipulation - the Suffix Tree.

The code was written with as much consistency with the theoretical algorithm as possible (see references). It provides a set of interface functions for creating and searching the tree.

The suffix tree is implemented in ANSI-C. The code is written based on the original algorithm by E.Ukkonen. This is the Perl interface for the underlying ANSI-C implementation.

FUNCTIONS

All functions are exported by default. Please note that all these interface functions were automatically extracted from the ANSI-C header file, so they might not behave as Perlish as you'd expect them to. This is something we will definitly address in the future.

$tree = create_tree($string)

Allocates memory for the tree and starts Ukkonen's construction algorithm. Parameters: A string. Returns a reference to the tree.

$position = find_substring($tree, $substring)

Searches for a string in the tree. It traverses the tree down starting its root like in a regular trie. Parameters: the tree to search in, a substring to look for. Returns the 1-based position it was found in the source string or 0 if string is not in the tree.

Prints the tree. Parameters: the tree to print.

delete_tree($tree)

Deletes a suffix tree. Parameters: the tree to delete. Returns : void.

DEPRECATED FUNCTIONS

All functions are exported by default. Please note that all these interface functions were automatically extracted from the ANSI-C header file, so they might not behave as Perlish as you'd expect them to. This is something we will try to address in the future.

$tree = ST_CreateTree($string, length($string))

Allocates memory for the tree and starts Ukkonen's construction algorithm. Parameters: A string, length of the string. Returns a reference to the tree.

$position = ST_FindSubstring($tree, $substring, length($substring))

Searches for a string in the tree. It traverses the tree down starting its root like in a regular trie. Parameters: the tree to search in, a substring to look for, length of substring. Returns the 1-based position it was found in the source string or 0 if string is not in the tree.

ST_PrintTree($tree)

Prints the tree. Parameters: the tree to print.

ST_DeleteTree($tree)

Deletes a suffix tree. Parameters: the tree to delete. Returns : void.

BUGS

This Perl interface was mostly built automatically (using SWIG). Little to no attention was given to testing. In future relases of this Perl Module (along with its underlying ANSI-C implementation) we hope to fix all problems that might currenly interfere with successful usage of this module. Please send bug reports to the current maintainer(s) of this module.

FUTURE WORK

[1] A better Perl-ish interface

[2] Building tests for this module (for the `make test` part of the installation)

[3] Object Oriented like usage

PORTABILITY

Please read the README file for information.

SEE ALSO

L<https://github.com/daoswald/SuffixTree.git> - This module's Github
repository.

L<http://en.wikipedia.org/wiki/Suffix_tree> - Wikipedia's Suffix Tree
explanation.

AUTHOR

Shlomo Yona <yona@cs.technion.ac.il> is the original author.

David Oswald <davido@cpan.org> is the current maintainer.

THANKS TO

Offer Kaye for useful ideas and support.

COPYRIGHT

Copyright (c) 2002, 2003, 2012 Shlomo Yona. All rights reserved.

This library is free software. You can redistribute it and/or modify it under the same terms as Perl itself.