NAME
Math::String::Charset::Wordlist - A dictionary charset for Math::String
SYNOPSIS
use Math::String::Charset::Wordlist;
my $x = Math::String::Charset::Wordlist->new ( {
file => 'path/dictionary.lst' } );
REQUIRES
perl5.005, DynaLoader, Math::BigInt, Math::String::Charset
EXPORTS
Exports nothing.
DESCRIPTION
This module lets you create a charset object, which is used to construct Math::String objects.
This object maps an external wordlist (a dictionary file with one word per line) to a simple charset, i.e. each word is one character in the charset.
The wordlist file must be sorted alphabetically (just like sort -u does), otherwise the results from converting between string and number form are unpredictable.
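Because an unsorted file gives unpredictable results rather than an error, it can be worth checking a wordlist before use. The following is a minimal sketch and not part of this module: the helper names is_sorted_unique() and file_is_sorted_unique() are hypothetical, and the check assumes plain byte-wise string comparison (as with LC_ALL=C sort -u). The ordering the module itself expects may differ under other locales.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the words are in strictly ascending
# byte-wise order (sorted, no duplicates), 0 otherwise.
sub is_sorted_unique {
    my ($words) = @_;
    for my $i (1 .. $#{$words}) {
        # "ge" catches both out-of-order entries and duplicates
        return 0 if $words->[$i - 1] ge $words->[$i];
    }
    return 1;
}

# Hypothetical helper: reads one word per line from a file and checks it.
sub file_is_sorted_unique {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    chomp(my @words = <$fh>);
    close $fh;
    return is_sorted_unique(\@words);
}
```

A call such as file_is_sorted_unique('path/dictionary.lst') could then be made before the file is passed to new().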
ERRORS
Upon error, the field _error stores the error message, then die() is called with this message. If you do not want the program to die (for instance, to catch the errors), then use the following:
use Math::String::Charset::Wordlist;
$Math::String::Charset::Wordlist::die_on_error = 0;
my $cs = Math::String::Charset::Wordlist->new(); # error, empty set!
print $cs->error(),"\n";
INTERNAL DETAILS
This object caches certain calculation results (for instance, which word is stored at which offset in the file), thus greatly speeding up sequential Math::String conversions from string to number, and vice versa.
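The idea of caching word offsets can be illustrated with a short sketch. This is not the module's actual code: the helpers build_offsets() and word_at() are hypothetical, and an in-memory filehandle stands in for a real wordlist file.

```perl
use strict;
use warnings;

# Hypothetical helper: scan a filehandle once and record the byte offset
# at which each line (word) starts.
sub build_offsets {
    my ($fh) = @_;
    my @offsets;
    my $pos = tell($fh);
    while (defined(my $line = <$fh>)) {
        push @offsets, $pos;
        $pos = tell($fh);
    }
    return \@offsets;
}

# Hypothetical helper: fetch the word with 0-based index $n by seeking
# directly to its recorded offset instead of re-reading the file.
sub word_at {
    my ($fh, $offsets, $n) = @_;
    return undef if $n < 0 || $n > $#{$offsets};
    seek($fh, $offsets->[$n], 0) or die "seek failed: $!";
    my $word = <$fh>;
    chomp $word;
    return $word;
}

# Demonstration on an in-memory "file" so the sketch needs no real wordlist
my $data = "apple\nbanana\ncherry\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $offsets = build_offsets($fh);
print join(',', @$offsets), "\n";          # 0,6,13
print word_at($fh, $offsets, 2), "\n";     # cherry
```

Once the offsets are known, looking up any word costs one seek and one line read, which is why repeated conversions benefit from such a cache.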
METHODS
- new()
-
Math::String::Charset::Wordlist->new();
Create a new Math::String::Charset::Wordlist object.
The constructor takes a HASH reference. The following keys can be used:
    minlen    Minimum string length, for now always 0
    maxlen    Maximum string length, for now always 1
    file      path/filename of wordlist file
    sep       separator character, none if undef
The resulting charset will always be of order 1, type 2.
The wordlist file must be sorted alphabetically (just like sort -u does), otherwise the results from converting between string and number form are unpredictable.
- minlen
-
Optional minimum string length. Any string shorter than this will be invalid. Must be smaller than a (possibly defined) maxlen. If not given, it is set to -inf. Note that minlen might be adjusted to a greater number if it is set to 1 or greater but there are no valid strings of length 2, 3 etc. In this case minlen will be set to the first non-empty class of the charset.
For wordlists, the minlen is always 0 (thus making '' the first valid string).
- maxlen
-
Optional maximum string length. Any string longer than this will be invalid. Must be greater than a (possibly defined) minlen. If not given, it is set to +inf.
For wordlists, the maxlen is always 1 (thus making the last word in the dictionary the last valid string).
- minlen()
-
$charset->minlen();
Return minimum string length.
- maxlen()
-
$charset->maxlen();
Return maximum string length.
- length()
-
$charset->length();
Return the number of items in the charset; for higher order charsets, the number of valid 1-character long strings. Shortcut for $charset->class(1).
- count()
-
Returns the count of all possible strings described by the charset as a positive BigInt. Returns 'inf' if no maxlen is defined, because there should be no upper bound on how many strings are possible.
If maxlen is defined, this forces a calculation of all possible class() values and may therefore be very slow on the first call. It may also cache a large number of values if maxlen is very high.
- class()
-
$charset->class($order);
Return the number of items in a class.
print $charset->class(5); # how many strings with length 5?
- char()
-
$charset->char($nr);
Returns the character number $nr from the set, or undef.
print $charset->char(0);  # first char
print $charset->char(1);  # second char
print $charset->char(-1); # last one
- lowest()
-
$charset->lowest($length);
Return the number of the first string of length $length. This is equivalent to (but much faster than):
$str = $charset->first($length);
$number = $charset->str2num($str);
- highest()
-
$charset->highest($length);
Return the number of the last string of length $length. This is equivalent to (but much faster than):
$str = $charset->first($length+1);
$number = $charset->str2num($str);
$number--;
- order()
-
$order = $charset->order();
Return the order of the charset. For wordlist charsets this is always 1 (see new()). See also type.
- type()
-
$type = $charset->type();
Return the type of the charset. For wordlist charsets this is always 2 (see new()). See also order.
- charlen()
-
$character_length = $charset->charlen();
Return the length of one character in the set. 1 or greater. All charsets used in a grouped charset must have the same length, unless you specify a separator char.
- seperator()
-
$sep = $charset->seperator();
Returns the separator string, or undefined if none is used.
- chars()
-
$chars = $charset->chars( $bigint );
Returns the number of characters that the string would have if you converted $bigint (a Math::BigInt or Math::String object) back to a string. This is much faster than doing
$chars = length ("$math_string");
since it does not need to actually construct the string.
- first()
-
$charset->first( $length );
Return the first string with a length of $length, according to the charset. See lowest() for the corresponding number.
- last()
-
$charset->last( $length );
Return the last string with a length of $length, according to the charset. See highest() for the corresponding number.
- is_valid()
-
$charset->is_valid();
Check whether a string conforms to the charset or not.
- error()
-
$charset->error();
Returns "" for no error, or the error message if construction of the charset failed. Set $Math::String::Charset::die_on_error to 0 to get the error message, otherwise the program will die.
- start()
-
$charset->start();
In list context, returns a list of all characters in the start set, that is, the ones used at the first string position. In scalar context, returns the length of the start set.
Think of the start set as the set of all characters that can start a string with one or more characters. The set for one-character strings is called ones and you can access it via $charset->ones().
- end()
-
$charset->end();
In list context, returns a list of all characters in the end set, that is, all characters a string can end with. In scalar context, returns the length of the end set.
- ones()
-
$charset->ones();
In list context, returns a list of all strings consisting of one character. In scalar context, returns the length of the ones set.
This list is the cross of start and end.
Think of a string of only one character as if it starts with and ends in this character at the same time.
The order of the chars in ones is the same as in start.
- prev()
-
$string = Math::String->new( );
$charset->prev($string);
Given the charset and a string, calculates the previous string in the sequence. This is faster than decrementing the number of the string and converting the new number to a string. This routine is mainly used internally by Math::String and updates the cache of the given Math::String.
- next()
-
$string = Math::String->new( );
$charset->next($string);
Given the charset and a string, calculates the next string in the sequence. This is faster than incrementing the number of the string and converting the new number to a string. This routine is mainly used internally by Math::String and updates the cache of the given Math::String.
- file()
-
$file = $charset->file();
Return the path/name of the dictionary file being used in constructing this character set.
- num2str()
-
my ($string,$length) = $charset->num2str($number);
Converts a Math::BigInt/Math::String to a string. In list context it returns the string and the length, in scalar context only the string.
- str2num()
-
$number = $charset->str2num($str);
Converts a string (literal string or Math::String object) to the corresponding number form (as Math::BigInt).
- offset()
-
my $offset = $charset->offset($number);
Returns the offset of the n'th word into the dictionary file.
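To make the relationship between num2str() and str2num() concrete, here is a small in-memory model of a wordlist charset. It is an illustration only, not the module's implementation: the helpers model_num2str() and model_str2num() are hypothetical, and the numbering (0 for the empty string, 1 for the first word) is inferred from the statement that minlen is always 0, making '' the first valid string. The binary search is an assumption that shows one plausible reason why an unsorted list would give wrong answers.

```perl
use strict;
use warnings;

my @words = qw(apple banana cherry);   # must be sorted, no duplicates

# Number -> string: 0 is '', 1..N are the words in list order.
sub model_num2str {
    my ($n) = @_;
    return '' if $n == 0;
    return undef if $n < 0 || $n > @words;
    return $words[$n - 1];
}

# String -> number via binary search; returns undef for unknown words.
sub model_str2num {
    my ($str) = @_;
    return 0 if $str eq '';
    my ($lo, $hi) = (0, $#words);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        my $cmp = $str cmp $words[$mid];
        return $mid + 1 if $cmp == 0;
        if ($cmp < 0) { $hi = $mid - 1 } else { $lo = $mid + 1 }
    }
    return undef;
}

print model_num2str(2), "\n";          # banana
print model_str2num('cherry'), "\n";   # 3
```

In this model the two functions are inverses for every valid number from 0 to the number of words, which mirrors the round trip the real methods are documented to provide.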
EXAMPLES
use Math::String;
use Math::String::Charset::Wordlist;
my $cs = Math::String::Charset::Wordlist->new( { file => 'big.sorted' } );
my $x = Math::String->new('',$cs)->binc();  # $x is now the first word
while ($x <= Math::BigInt->new(10))         # Math::BigInt->new() is necessary!
{
    # print the first 10 words
    print $x++,"\n";
}
BUGS
Please report any bugs or feature requests to bug-math-string-charset-wordlist at rt.cpan.org, or through the web interface at https://rt.cpan.org/Ticket/Create.html?Queue=Math-String-Charset-Wordlist (requires login). We will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Math::String::Charset::Wordlist
You can also look for information at:
RT: CPAN's request tracker
https://rt.cpan.org/Dist/Display.html?Name=Math-String-Charset-Wordlist
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
http://cpanratings.perl.org/dist/Math-String-Charset-Wordlist
Search CPAN
CPAN Testers Matrix
http://matrix.cpantesters.org/?dist=Math-String-Charset-Wordlist
AUTHOR
If you use this module in one of your projects, then please email me. I want to hear about how my code helps you ;)
This module is (C) Copyright by Tels http://bloodgate.com 2003-2008.
Copyright 2017- Peter John Acklam pjacklam@online.no.