NAME
Mock::Populate - Mock data creation
VERSION
version 0.0901
SYNOPSIS
use Mock::Populate;
# * Call each function below as Mock::Populate::foo(...)
$ids = number_ranger(start => 1, end => 1001, prec => 0, random => 0, N => $n);
$money = number_ranger(start => 1000, end => 5000, prec => 2, random => 1, N => $n);
$create = date_ranger(start => '1900-01-01', end => '2020-12-31', N => $n);
$modify = date_modifier($offset, @$create);
$times = time_ranger(stamp => 1, start => '01:02:03', end => '23:59:59', N => $n);
$people = name_ranger(gender => 'b', names => 2, country => 'us', N => $n);
$email = email_modifier(@$people);
$shuff = shuffler($n, qw(foo bar baz goo ber buz));
$stats = distributor(type => 'u', prec => 4, dof => 2, N => $n);
$string = string_ranger(length => 32, type => 'base64', N => $n);
$imgs = image_ranger(size => 10, N => $n); # *size is density, not pixel dimension
$coll = collate($ids, $people, $email, $create, $times, $modify, $times);
DESCRIPTION
This is a set of functions for mock data creation.
No functions are exported, so call each one by its fully qualified
name, for example Mock::Populate::collate().
Each function produces a list of elements that can be used as a database column. The collate()
function takes these columns and returns a list of rows, each an arrayref. The rows can then be converted to CSV, JSON, etc., or inserted directly into your favorite database with your favorite Perl ORM.
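As an illustration of the processing step, the sketch below serializes a small table of rows to JSON with the core JSON::PP module. The $rows literal is hand-written stand-in data in the shape collate() is described as returning; it is not output from this module.

```perl
use strict;
use warnings;
use JSON::PP;

# Hand-written stand-in for the rows collate() is described as returning:
# an arrayref of arrayrefs, one inner arrayref per row.
my $rows = [
    [ 1, 'Ann Lee', 'ann.lee@example.com' ],
    [ 2, 'Bob Ray', 'bob.ray@example.org' ],
];

# Encode the whole table as a JSON array of arrays.
my $json = JSON::PP->new->encode($rows);
print "$json\n";
```

For CSV output, a dedicated CSV module is usually a safer choice than a plain join, because field values may need quoting.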
FUNCTIONS
date_ranger()
$results = date_ranger(start => $start, end => $end, N => $n);
Return a list of N random dates within a range. The start date, end date and number of data-points arguments are all optional. The defaults are:
start: 2000-01-01
end: today (computed if not given)
N: 10
The dates must be given as YYYY-MM-DD strings.
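The idea of drawing random dates from a range can be sketched with core modules only. The helpers date_to_epoch() and random_dates() are hypothetical names used for illustration and are not part of this module; its actual implementation may differ.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);
use POSIX qw(strftime);

# Convert a YYYY-MM-DD string to epoch seconds (UTC midnight).
sub date_to_epoch {
    my ($date) = @_;
    my ($y, $m, $d) = split /-/, $date;
    return timegm(0, 0, 0, $d, $m - 1, $y);
}

# Hypothetical helper: pick $n random dates between $start and $end, inclusive.
sub random_dates {
    my ($start, $end, $n) = @_;
    my $from = date_to_epoch($start);
    my $days = int((date_to_epoch($end) - $from) / 86_400);
    my @dates;
    for (1 .. $n) {
        # Add a random whole number of days to the start date.
        my $epoch = $from + 86_400 * int(rand($days + 1));
        push @dates, strftime('%Y-%m-%d', gmtime($epoch));
    }
    return \@dates;
}

my $dates = random_dates('2000-01-01', '2020-12-31', 10);
print "$_\n" for @$dates;
```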
date_modifier()
$modify = date_modifier($offset, @$dates);
Return a new list of random future dates, one for each given date, each computed from that date and the offset.
time_ranger()
$results = time_ranger(
stamp => $stamp, start => $start, end => $end,
N => $n);
Return a list of N random times within a range. The stamp flag, start and end times, and desired number of data-points arguments are all optional. The defaults are:
stamp: 1 (boolean)
start: 00:00:00
end: now (computed if not given)
N: 10
The times must be given as HH:MM:SS strings, as in the synopsis.
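A minimal sketch of picking random times within a range, using plain arithmetic on seconds since midnight. The helpers time_to_seconds() and random_times() are hypothetical, the stamp argument is not modelled, and accepting either ':' or '-' as the separator is an assumption of this sketch rather than documented behaviour.

```perl
use strict;
use warnings;

# Convert a time string to seconds since midnight.
# Accepting either ':' or '-' as the separator is an assumption of this sketch.
sub time_to_seconds {
    my ($time) = @_;
    my ($h, $m, $s) = split /[-:]/, $time;
    return $h * 3600 + $m * 60 + $s;
}

# Hypothetical helper: pick $n random times between $start and $end, inclusive.
sub random_times {
    my ($start, $end, $n) = @_;
    my $from = time_to_seconds($start);
    my $span = time_to_seconds($end) - $from;
    my @times;
    for (1 .. $n) {
        my $t = $from + int(rand($span + 1));
        push @times, sprintf('%02d:%02d:%02d', int($t / 3600), int($t / 60) % 60, $t % 60);
    }
    return \@times;
}

my $times = random_times('01:02:03', '23:59:59', 10);
print "$_\n" for @$times;
```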
number_ranger()
$results = number_ranger(
start => $start, end => $end,
prec => $prec, random => $random,
N => $n);
Return a list of N numbers within a range, either random or sequential. The start, end, precision, random flag and desired number of data-points arguments are all optional. The defaults are:
start: 0
end: 9
precision: 2
random: 1
N: 10
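The random case with a fixed precision can be sketched as follows. The helper random_numbers() is a hypothetical name, and the sketch covers only random => 1; it is not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper: $n random numbers between $start and $end,
# each formatted to $prec decimal places.
sub random_numbers {
    my ($start, $end, $prec, $n) = @_;
    my @numbers;
    for (1 .. $n) {
        my $value = $start + rand($end - $start);
        # '%.*f' takes the precision from the argument list.
        push @numbers, sprintf('%.*f', $prec, $value);
    }
    return \@numbers;
}

my $money = random_numbers(1000, 5000, 2, 10);
print "$_\n" for @$money;
```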
name_ranger()
$results = name_ranger(
gender => $gender, names => $names, country => $country,
N => $n);
Return a list of N random person names. The gender, number of names, country and desired number of data-points arguments are all optional. The defaults are:
gender: b (options: both, female, male)
names: 2 (first, last)
country: us
N: 10
email_modifier()
$results = email_modifier(@people);
# first.last@example.{com,net,org,edu}
Return a list of email addresses, one for each given name.
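The first.last@example.{com,net,org,edu} pattern can be sketched like this. The helper names_to_emails() is hypothetical, and the assumption that each name arrives as a single space-separated string is mine, not something the documentation states.

```perl
use strict;
use warnings;

# Hypothetical helper: build one address per name as first.last@example.<tld>.
# Assumes each name is a single space-separated string such as 'Ann Lee'.
sub names_to_emails {
    my @people = @_;
    my @tlds   = qw(com net org edu);
    my @emails;
    for my $person (@people) {
        # Lowercase the name parts and join them with dots.
        my $local = join '.', map { lc $_ } split ' ', $person;
        # Pick one of the top-level domains at random.
        push @emails, $local . '@example.' . $tlds[ int rand @tlds ];
    }
    return \@emails;
}

my $emails = names_to_emails('Ann Lee', 'Bob Ray');
print "$_\n" for @$emails;
```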
distributor()
$results = distributor(type => $type, prec => $prec, dof => $dof, N => $n);
Return a list of N distribution values. The type, precision, degrees-of-freedom and desired number of data-points arguments are optional. The defaults are:
type: u (normal)
precision: 2
degrees-of-freedom: 2
N: 10
Types
This function uses single letter identifiers:
u: Normal distribution (default)
c: Chi-squared distribution
s: Student's T distribution
f: F distribution
Degrees of freedom
Given the type, this function accepts the following:
c: A single integer
s: A single integer
f: A fraction string of the form 'N/D' (default 2/1)
shuffler()
$results = shuffler($n, @items);
Return a shuffled list of $n items. The items and number of data-points arguments are optional. The defaults are:
n: 10
items: a b c d e f g h i j
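Shuffling and taking the first $n items can be sketched with the core List::Util module. The helper shuffled_sample() is a hypothetical name, and the sketch assumes $n does not exceed the number of items; how the module handles a larger $n is not stated here.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper: return $n items taken from a shuffled copy of @items.
# Assumes $n does not exceed the number of items.
sub shuffled_sample {
    my ($n, @items) = @_;
    my @shuffled = shuffle @items;
    return [ @shuffled[ 0 .. $n - 1 ] ];
}

my $shuff = shuffled_sample(3, qw(foo bar baz goo ber buz));
print "@$shuff\n";
```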
string_ranger()
$results = string_ranger(type => $type, length => $length, N => $n);
Return a list of N strings. The type, length and number of data-points arguments are optional. The defaults are:
type: default
length: 8
N: 10
* This function is nearly identical to the Data::SimplePassword rndpassword
program, but allows you to generate a finite number of results.
Types
Types    Output sample     Character set
___________________________________________________
default  0xaVbi3O2Lz8E69s  0..9 a..z A..Z
ascii    n:.T<Gr!,e*[k=eu  visible ascii
base64   PC2gb5/8+fBDuw+d  0..9 a..z A..Z / +
path     PC2gb5/8.fBDuw.d  0..9 a..z A..Z / .
simple   xek4imbjcmctsxd3  0..9 a..z
hex      89504e470d0a1a0a  0..9 a..f
alpha    femvifzscyvvlwvn  a..z
pron     werbucedicaremoz  a..z but pronounceable!
digit    7563919623282657  0..9
binary   1001011110000101
morse    -.--...-.--.-..-
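To show how a character set from the table maps to output, here is a minimal sketch that draws characters at random from a given set. The helper random_strings() is hypothetical and uses Perl's built-in rand(), so it is suitable for mock data only; it does not reproduce how this module or Data::SimplePassword generates strings.

```perl
use strict;
use warnings;

# Hypothetical helper: $n random strings of $length characters drawn from @chars.
sub random_strings {
    my ($length, $n, @chars) = @_;
    my @strings;
    for (1 .. $n) {
        my $string = '';
        # Append one randomly chosen character per position.
        $string .= $chars[ int rand @chars ] for 1 .. $length;
        push @strings, $string;
    }
    return \@strings;
}

# Character set matching the "hex" row of the table: 0..9 a..f
my $hex = random_strings(8, 10, 0 .. 9, 'a' .. 'f');
print "$_\n" for @$hex;
```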
image_ranger()
$results = image_ranger(size => $size, N => $n);
Return a list of N 1x1 pixel images of varying byte sizes (not image dimension). The byte size and number of data-points are both optional.
The defaults are:
N: 10
size: 8
collate()
$rows = collate(@columns);
Return a list of rows (arrayrefs) representing a 2D table, built from the given column lists: the i-th member of each column becomes a member of the i-th row.
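The column-to-row step amounts to a transpose, which can be sketched as below. The function columns_to_rows() is a hypothetical stand-in written for illustration; it is not the module's collate() and may differ from it in edge cases such as columns of unequal length.

```perl
use strict;
use warnings;

# Hypothetical stand-in for collate(): turn column arrayrefs into row arrayrefs.
sub columns_to_rows {
    my @columns = @_;
    my @rows;
    for my $column (@columns) {
        for my $i (0 .. $#$column) {
            # Append this column's i-th value to the i-th row.
            push @{ $rows[$i] }, $column->[$i];
        }
    }
    return \@rows;
}

my $rows = columns_to_rows([ 1, 2 ], [ 'Ann Lee', 'Bob Ray' ]);
# $rows is now [ [ 1, 'Ann Lee' ], [ 2, 'Bob Ray' ] ]
print join(', ', @$_), "\n" for @$rows;
```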
SEE ALSO
Data::Random does nearly the exact same thing. Whoops!
TO DO
Implement dirty-data randomizing.
unexpected formats: iso-8859-1, utf-16, windows codepage,
BOM (byte order marker),
broken unicode,
garbled binary,
\r and \n variations,
commas or $ in currencies ("format fuckups"),
bad JSON,
broken XML,
bad ' and " in CSV,
statistical outliers,
time-series drops and spikes,
duplicate data,
missing data,
truncated data,
AUTHOR
Gene Boggs <gene@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2014 by Gene Boggs.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.