NAME

String::Blender - flexible vocabulary-based generator of compound words (e.g. domain names).

VERSION

This document describes String::Blender version 0.04

SYNOPSIS

use String::Blender;

my $blender = String::Blender->new(
    vocab_files => [
        './vocab/hacker-jargon.txt',  # load into vocab #0
        [
            './vocab/places.txt',     # load both files
            './vocab/boosters.txt',   # into vocab #1
        ]
    ],
    quantity => 10,
    max_length => 20,
    max_elements => 3,
    postfix => '.com',
);

my @result = $blender->blend;

# The @result will look like this:
# (
#      'tastybitshandler.com',
#      'bubblesortcore.com',
#      'regexpkingdom.com',
#      'bigslashbase.com',
#      'powerslurp.com',
#      'pipestacklabel.com',
#      'metaspoofzone.com',
#      'randomsubshell.com',
#      'forehandleroot.com',
#      'pragmaware.com'
# );

# Vocabularies can also be specified directly, e.g.:
my $blender = String::Blender->new(
    vocabs => [
        [qw/web net host site list archive core base switch/],
        [qw/candy honey muffin sugar sweet yammy/],
        [qw/area city club dominion empire field land valley world/],
    ],
    strict_order => 1,
    min_elements => 3,
    max_elements => 3,
    max_length => 20,
    delimiter => "-",
);

my @result = $blender->blend(5);

# Then the @result will look like this:
# (
#      'base-honey-field',
#      'list-candy-dominion',
#      'web-sugar-land',
#      'archive-muffin-field',
#      'web-yammy-area'
# );

DESCRIPTION

String::Blender is an object-oriented random generator of compound words built from one or more prioritized word vocabularies. The module was originally created to construct attractive thematic domain names. It was later used to improve a dictionary attack tool.

Each vocabulary is an array of single words, not necessarily sorted. All vocabularies are stored in an array in a predefined order. String::Blender can load vocabularies from plain text files or accept them directly.

The resulting compound words are returned as a list of unique strings. Each string consists of one or more vocabulary words placed in serial or random order, optionally prefixed, followed, and/or separated by defined strings.

Construction of one compound word can be briefly described as follows:

  • Choose a random number of elements within the given constraints.

  • Visit the vocabularies in order, up to the chosen number of elements, and take one random word from each. Once the number of component words exceeds the number of vocabularies, take each further word from a random vocabulary.

  • Concatenate the selected words, joining them with the delimiter if one is defined, and add the prefix and postfix if defined.

  • Check the length of the resulting word. Retry if it is too long or too short.
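
The four steps above can be sketched in plain Perl. This is a minimal illustration under stated assumptions, not the module's actual code: build_candidate is a hypothetical helper, its option names merely mirror the attributes documented below, and it keeps the words in selection order without any reordering.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of String::Blender): builds one candidate
# compound word, or returns undef when the length check fails.
sub build_candidate {
    my (%opt) = @_;
    my @vocabs = @{ $opt{vocabs} };

    # Step 1: pick a random element count within [min_elements, max_elements].
    my $span  = $opt{max_elements} - $opt{min_elements} + 1;
    my $count = $opt{min_elements} + int(rand($span));

    # Step 2: walk the vocabularies in order; past the last one, pick a
    # random vocabulary for every further element.
    my @words;
    for my $i (0 .. $count - 1) {
        my $vocab = $i < scalar(@vocabs)
            ? $vocabs[$i]
            : $vocabs[ int(rand(scalar @vocabs)) ];
        push @words, $vocab->[ int(rand(scalar @$vocab)) ];
    }

    # Step 3: join with the delimiter and wrap in prefix and postfix.
    my $word = $opt{prefix} . join($opt{delimiter}, @words) . $opt{postfix};

    # Step 4: reject candidates outside the length limits.
    my $len = length $word;
    return undef if $len < $opt{min_length} || $len > $opt{max_length};
    return $word;
}

my $candidate = build_candidate(
    vocabs       => [ [qw/web net host/], [qw/candy honey sugar/] ],
    min_elements => 2,
    max_elements => 2,
    min_length   => 5,
    max_length   => 20,
    delimiter    => '-',
    prefix       => '',
    postfix      => '.com',
);
print defined($candidate) ? "$candidate\n" : "rejected, try again\n";
```

A caller would invoke such a helper repeatedly, discarding undefined results and duplicates, until enough words have been collected.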

SUBROUTINES/METHODS

Class methods

  • new (%config)

    The new constructor method instantiates a new String::Blender object. A hash of configuration attributes may be passed as a parameter. See the "CONFIGURATION AND ENVIRONMENT" section.

Object methods

  • blend ($quantity)

    Generates and returns a list of $quantity or fewer compound words in the manner explained in "DESCRIPTION", according to the constraints and options set as the object attributes described below. If $quantity is omitted, the value of the object attribute with the same name is used.

  • load_vocabs

    Loads vocabulary lists from plain text files, collecting one element per line, and stores them in the "vocabs" attribute. Takes the lists of files from the "vocab_files" attribute. Returns the number of vocabularies loaded. Note that this method is invoked automatically after object creation if "vocabs" is empty, and after each setting of the "vocab_files" attribute, so you will not have to call it manually.

  • BUILD

    Normally you will not have to invoke this method directly, but you might want to override it. The BUILD method is called after the object is constructed. In String::Blender it attempts to load vocabularies from the files specified in the "vocab_files" attribute when no vocabularies are provided directly through the "vocabs" attribute.

CONFIGURATION AND ENVIRONMENT

The following list gives a short summary of each String::Blender object attribute. All of them can be defined on object creation (see "new") or set separately as follows.

$blender->max_elements(30);
$blender->vocabs(\@my_vocabs);

Vocabularies

  • vocabs

    Contains a reference to an array of vocabularies. Each vocabulary is represented by a reference to an array of strings, one per element. None of those strings should be empty or contain newlines or control characters. If left undefined on object creation, this attribute is set automatically by the "load_vocabs" method. In that case the "vocab_files" attribute must be set properly.

  • vocab_files

    Defines the filenames and lists of filenames to read vocabularies from. Contains a reference to an array of filenames and/or references to arrays of filenames. The "load_vocabs" method merges the vocabularies loaded from filenames grouped in one inner array into a single vocabulary. After object creation, "load_vocabs" is invoked every time the "vocab_files" attribute is set. Each vocabulary file should contain one word per line in plain text format.
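
To make the nesting rule concrete, here is a rough sketch of how a "vocab_files" structure could map onto vocabularies. It is an assumption-laden stand-in, not the module's "load_vocabs": read_vocab_files is a hypothetical function, the file names and error texts are invented for the demo, and skipping empty lines is a choice made here rather than documented behaviour.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical stand-in for load_vocabs: every top-level entry becomes one
# vocabulary; an inner array of filenames is merged into a single vocabulary.
sub read_vocab_files {
    my ($files) = @_;
    my @vocabs;
    for my $entry (@$files) {
        my @group = ref($entry) eq 'ARRAY' ? @$entry : ($entry);
        my @words;
        for my $path (@group) {
            open my $fh, '<', $path or die "Could not open file $path: $!";
            while (my $line = <$fh>) {
                chomp $line;
                push @words, $line if length $line;   # assumption: skip empty lines
            }
            close $fh or die "Could not close file $path: $!";
        }
        push @vocabs, \@words;
    }
    return \@vocabs;
}

# Demo data written to a temporary directory (file names are made up).
my $dir = tempdir(CLEANUP => 1);
my %content = (
    'jargon.txt'   => "regexp\nslurp\n",
    'places.txt'   => "zone\nkingdom\n",
    'boosters.txt' => "power\nmeta\n",
);
for my $name (keys %content) {
    my $path = File::Spec->catfile($dir, $name);
    open my $out, '>', $path or die "Could not write $path: $!";
    print {$out} $content{$name};
    close $out or die "Could not close $path: $!";
}

my $vocabs = read_vocab_files([
    File::Spec->catfile($dir, 'jargon.txt'),                          # vocab #0
    [ map { File::Spec->catfile($dir, $_) } qw(places.txt boosters.txt) ],  # vocab #1
]);
printf "%d vocabularies; vocab #1 holds %d words\n",
    scalar(@$vocabs), scalar(@{ $vocabs->[1] });
```

With the demo files above, the single file yields vocabulary #0 and the two grouped files are merged into vocabulary #1.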

Constraints

  • min_length, max_length

    Define the minimum and the maximum length in characters of the resulting string. Positive integers, default: 5 and 20 respectively.

  • min_elements, max_elements

    Define the minimum and the maximum number of elements the resulting string should consist of. Positive integers, default: 2 and 5 respectively.

  • max_tries_factor

    Defines the maximum number of generation loops per "blend" call as the product of the "quantity" and max_tries_factor values. Positive integer, default: 4. For example, if "quantity" equals 10, the number of generation loops is limited to 40.
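
The effect of the tries limit can be illustrated with a small loop. This is only a sketch of the idea: collect_unique and its generator callback are hypothetical names, and the value shown in parentheses in the warning is an assumption, since the documentation gives the message only as a template.

```perl
use strict;
use warnings;

# Hypothetical loop (not the module's code): collect up to $quantity unique
# strings, giving up after $quantity * $max_tries_factor attempts.
sub collect_unique {
    my ($quantity, $max_tries_factor, $generator) = @_;
    my $max_tries = $quantity * $max_tries_factor;
    my (%seen, @result);
    my $tries = 0;
    while (scalar(@result) < $quantity && $tries < $max_tries) {
        $tries++;
        my $word = $generator->();
        # Skip rejected candidates (undef) and duplicates.
        next if !defined($word) || $seen{$word}++;
        push @result, $word;
    }
    warn "Maximum tries limit exceeded ($max_tries)\n"
        if scalar(@result) < $quantity;
    return @result;
}

# A pool of three words can never satisfy a request for ten unique strings,
# so the loop stops after 10 * 4 = 40 tries and warns.
my @pool  = qw(alpha beta gamma);
my @words = do {
    local $SIG{__WARN__} = sub { print "warning: $_[0]" };
    collect_unique(10, 4, sub { $pool[ int(rand(scalar @pool)) ] });
};
printf "got %d of 10 requested\n", scalar(@words);
```

Raising the factor buys more attempts per requested string, which helps only when further unique strings are actually reachable under the current constraints.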

Options

  • quantity

    Defines the quantity of strings to be generated per one invocation of the "blend" method. Positive integer, default: 10.

  • strict_order

    Concatenate string elements according to the strict order of vocabularies they were taken from. Boolean, default: false.

  • delimiter

    String to separate string elements with in each resulting string. Empty by default.

  • prefix

    String to prefix each resulting string with. Empty by default.

  • postfix

    String to append to each resulting string. Empty by default.
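
One plausible reading of how "strict_order" interacts with the other options is sketched below. The helper assemble and its [vocabulary index, word] pairs are hypothetical, and both the sort by vocabulary index and the shuffle in the non-strict case are assumptions about behaviour that this document does not spell out.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper: $picks holds [vocabulary_index, word] pairs.
sub assemble {
    my ($picks, %opt) = @_;
    my @ordered = $opt{strict_order}
        ? (sort { $a->[0] <=> $b->[0] } @$picks)   # assumption: order by vocabulary
        : (shuffle @$picks);                       # assumption: random order
    my $delimiter = defined $opt{delimiter} ? $opt{delimiter} : '';
    my $prefix    = defined $opt{prefix}    ? $opt{prefix}    : '';
    my $postfix   = defined $opt{postfix}   ? $opt{postfix}   : '';
    return $prefix . join($delimiter, map { $_->[1] } @ordered) . $postfix;
}

my @picks = ([1, 'honey'], [0, 'web'], [2, 'land']);
print assemble(\@picks, strict_order => 1, delimiter => '-'), "\n";  # web-honey-land
print assemble(\@picks, delimiter => '-', postfix => '.com'), "\n";  # order varies
```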

DIAGNOSTICS

There are some exceptional situations worth consideration.

Maximum tries limit exceeded (%s)

Normally the size of the list returned by the "blend" method equals $quantity. But since the method is meant to return unique strings within certain restrictions, under some conditions generation could loop indefinitely. The "max_tries_factor" attribute exists to prevent that. When the generator runs into narrow constraints and/or poor vocabularies, the resulting list may turn out shorter than expected or even empty, and this warning is issued. To avoid this, you might want to increase the value of the "max_tries_factor" attribute or relax generation constraints such as "min_elements", "max_elements", "min_length", and "max_length".

There are no vocabulary files specified

The "load_vocabs" method dies if the "vocab_files" attribute is not defined or refers to an empty list.

Could not open (close) file %s

"load_vocabs" also dies if it cannot open a file specified in the "vocab_files" attribute.

Attribute (%s) does not pass the type constraint because: %s

Assigning a value that does not match an attribute's type constraint causes this fatal error.

DEPENDENCIES

String::Blender depends on the Moose object system (version 0.74 or newer) which must be installed separately.

INCOMPATIBILITIES

None reported.

BUGS AND LIMITATIONS

No bugs have been reported. The API is not yet stable and may change in the future.

Please report any bugs or feature requests to bug-string-blender@rt.cpan.org, or through the web interface at http://rt.cpan.org.

AUTHOR

Alexey Skorikov <alexey@skorikov.name>

LICENSE AND COPYRIGHT

Copyright (c) 2009, Alexey Skorikov <alexey@skorikov.name>. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.