NAME
Lingua::Awkwords - randomly generates outputs from a given pattern
SYNOPSIS
use feature qw(say);
use Lingua::Awkwords;
use Lingua::Awkwords::Subpattern;
# V is a pre-defined subpattern, ^ filters out aa from the list
# of two vowels that the two VV generate
my $la = Lingua::Awkwords->new( pattern => q{ [VV]^aa } );
say $la->render for 1..10;
# define our own C, V
Lingua::Awkwords::Subpattern->set_patterns(
    C => [qw/j k l m n p s t w/],
    V => [qw/a e i o u/],
);
# and a pattern somewhat suitable for Toki Pona...
$la->pattern(q{
[a/*2]
(CV*5)^ji^ti^wo^wu
(CV*2)^ji^ti^wo^wu
[CV/*2]^ji^ti^wo^wu
[n/*5]
});
say $la->render for 1..10;
DESCRIPTION
This is a Perl implementation of
http://akana.conlang.org/tools/awkwords/
though it is not an exact replica of that parser. The format that this code is based on is detailed at
http://akana.conlang.org/tools/awkwords/help.html
and is summarized in the next section.
SYNTAX
- [] or ()
-
Denote a unit or group; they are identical except that (a) is equivalent to [a/]; that is, it represents the possibility of generating the empty string in addition to any other terms supplied. Units can be nested recursively. There is an implicit unit at the top level of the pattern.
- /
-
Introduces a choice within a unit; without this, [Vx] would generate whatever V represents (a list of vowels by default) followed by the letter x, while [V/x] by contrast generates only a vowel or the letter x.
- *
-
The asterisk followed by an integer in the range 1..128 inclusive weights the current term of the alternation, if any. That is, while [a/] generates each term with equal probability, [a/*2] would generate the empty string at twice the probability of the letter a.
- ^
-
The caret introduces a filter that must follow a unit (there is an implicit unit at the top level of a pattern). An example would be [VV]^aa or the equivalent VV^aa that (by default) generates two vowels, but replaces aa with the empty string. More than one filter may be specified.
- A-Z
-
Capital ASCII letters denote subpatterns; several of these are set by default. See Lingua::Awkwords::Subpattern for how to customize them. V, for example, is by default equivalent to the more verbose [a/i/u].
- "
-
Use double quotes to denote a quoted string; this prevents other characters (besides " itself) from being interpreted as some non-string value.
- anything-else
-
Anything else not otherwise accounted for above is treated as part of a string, so ["abc"/abc] generates either the string abc or the string abc, as these are two ways of saying the same thing.
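The weighting rule given for * above can be illustrated without the module. What follows is only a sketch in core Perl, assuming that an unweighted term counts as weight 1 (which the [a/] versus [a/*2] comparison implies); weighted_pick is a hypothetical helper and not how Lingua::Awkwords itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Lingua::Awkwords: pick one term from a
# list of [ term, weight ] pairs with probability proportional to weight.
sub weighted_pick {
    my (@choices) = @_;
    my $total = 0;
    $total += $_->[1] for @choices;
    my $roll = rand($total);    # uniform in [0, $total)
    for my $choice (@choices) {
        return $choice->[0] if $roll < $choice->[1];
        $roll -= $choice->[1];
    }
    return $choices[-1][0];     # guard against floating point edge cases
}

# Models [a/*2]: the letter "a" at the assumed default weight of 1 and
# the empty string at weight 2.
my @terms = ( [ 'a', 1 ], [ '', 2 ] );
my %count;
$count{ weighted_pick(@terms) }++ for 1 .. 30_000;
printf "a: %d, empty: %d\n", $count{'a'} // 0, $count{''} // 0;
```

Over many picks the empty string should turn up roughly twice as often as a, matching the description of [a/*2].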
ATTRIBUTES
- pattern
-
Awkword pattern. If no pattern has been supplied, any call to render will throw an exception.
- tree
-
Where the parse tree is stored.
METHODS
- new
-
Constructor. Typically this should be passed a pattern argument.
- parse_string pattern
-
Returns the parse tree of the given pattern without setting the tree attribute. "COMPLICATIONS" shows one use for this.
- render
-
Returns a string rendered from the awkword pattern. This may be the empty string if filters have removed all the text.
- walk callback
-
Provides a means to recurse through the parse tree: every object in the tree calls the callback with $self as the sole argument and then, if necessary, iterates through all of the possibilities it contains, calling walk on each of those.
COMPLICATIONS
More complicated structures can be built by attaching parse trees to subpatterns. For example, Toki Pona could be extended to allow optional diphthongs (mostly in the second syllable) via
use feature qw(say);
use Lingua::Awkwords::Subpattern;
use Lingua::Awkwords;
my $cv = Lingua::Awkwords->parse_string(q{
CV^ji^ti^wo^wu
});
my $cvv = Lingua::Awkwords->parse_string(q{
CVV^ji^ti^wo^wu^aa^ee^ii^oo^uu
});
Lingua::Awkwords::Subpattern->set_patterns(
    A => $cv,
    B => $cvv,
    C => [qw/j k l m n p s t w/],
    V => [qw/a e i o u/],
);
my $tree = Lingua::Awkwords->new( pattern => q{
[ a[B/BA/BAA/A/AA/AAA] / [AB/ABA/ABAA/A/AA/AAA] ] [n/*5]
});
say join ' ', map { $tree->render } 1 .. 10;
Replacing filtered text with the empty string (the default) can be problematic, as one may not know whether a filter has been applied to the result, or the word may be filtered into an incorrect form. The above trees with filters can be modified as follows
$tree->walk( set_filter('X') );
# more or less the equivalent of a let-over-lambda in LISP
sub set_filter {
    my $filter = shift;
    return sub {
        my $self = shift;
        $self->filter_with($filter) if $self->can('filter_with');
    };
}
to instead replace filtered values with X, after which enough words can be generated, skipping those that were filtered, via
my @words;
while (1) {
    my $possible = $tree->render;
    next if $possible =~ m/X/;
    push @words, $possible;
    last if @words >= 10;
}
say join ' ', @words;
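Note that the loop above will not terminate if every rendered word happens to be filtered. One possible variation, sketched here in core Perl, caps the number of attempts. The collect_words helper and its arguments are hypothetical and not part of Lingua::Awkwords, and a stand-in generator replaces $tree->render so that the sketch runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Lingua::Awkwords: call $generate until
# $want acceptable words are collected or $max_tries attempts are used up.
sub collect_words {
    my ( $generate, $want, $max_tries, $marker ) = @_;
    my @words;
    for ( 1 .. $max_tries ) {
        my $possible = $generate->();
        next if index( $possible, $marker ) >= 0;    # filtered: contains marker
        push @words, $possible;
        last if @words >= $want;
    }
    return @words;
}

# Stand-in generator for demonstration only; with the module this would be
#   sub { $tree->render }
my @pool = qw(kala X tomo Xn sina);
my $i    = 0;
my @words = collect_words( sub { $pool[ $i++ % @pool ] }, 4, 100, 'X' );
print join( ' ', @words ), "\n";    # prints "kala tomo sina kala"
```

If fewer than the wanted number of words come back, the caller can decide whether to loosen the filters or raise the attempt limit.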
BUGS
Reporting Bugs
Please report any bugs or feature requests to bug-lingua-awkwords at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Lingua-Awkwords.
Patches might best be applied towards:
https://github.com/thrig/Lingua-Awkwords
Known Issues
There are various incompatibilities with the original version of the code; these are detailed in the parser module as they concern how e.g. weights are parsed.
See also the "Known Issues" section in all the other modules in this distribution.
SEE ALSO
Lingua::Awkwords::ListOf, Lingua::Awkwords::OneOf, Lingua::Awkwords::Parser, Lingua::Awkwords::String, Lingua::Awkwords::Subpattern
AUTHOR
thrig - Jeremy Mates (cpan:JMATES) <jmates at cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2017 by Jeremy Mates
This program is distributed under the (Revised) BSD License: http://www.opensource.org/licenses/BSD-3-Clause