NAME
float.pl - Genetic programming front-end using Test::Float, StupidMarkov, and PPI
SYNOPSIS
float.pl is to Test::Float as prove is to Test::Harness. That is, float.pl is a command line interface to Test::Float.
perl float.pl --help
perl float.pl --learn /path/to/some/code
perl float.pl --spew 20
perl float.pl --code
WARNING! In the process of assimilating existing code and creating semi-random permutations from it, this script could easily come up with code that will ERASE YOUR DATA OR SEND INAPPROPRIATE PHOTOS TO YOUR INLAWS.
GLOSSARY
This has a number of parts. It's useful to define them before getting into arguments and usage. This also ships with a demo.
- Test::Float -- hacked up Test::Harness that understands floating point test results.
- float.pl -- this script; trains a Markov engine from code samples, generates semi-random snippets, and applies a simple genetic programming algorithm to the snippets, using floating point test results as fitness tests
- t/* -- internal tests that have to pass before cpanm or whatever will install Test::Float; returns ok/not ok; uninteresting
- fitness-t/* -- genetic selection fitness tests that return floating point values
- fitness-t/goo.t -- genetic selection fitness tests that do some basic sanity checking, such as looking for code that passes a syntax check
- fitness-t/logic.t -- genetic selection fitness tests that inspect output on STDOUT; this test should be used as an example but otherwise REMOVED or ALTERED to test for whatever you want float.pl --code to write code to do
- goo.pl -- the primary output of float.pl --code; also the current member of the current generation of genetic-Markov code samples being tested by float.pl --code; after float.pl --code finishes, the specimen with the best test score is left in place as goo.pl; the population of specimens exists primarily in memory
- seq.pl -- an example starting program to output a (kind of) Fibonacci sequence of numbers; it contains a bug (with a comment)
seq.pl and fitness-t/logic.t, as shipped, are part of a demonstration in automatic bug repair. seq.pl attempts to compute (sort of) the Fibonacci sequence but contains a bug (with a comment marking it). fitness-t/logic.t tests for the correct output of the first three numbers in the (kind of, simplified) Fibonacci sequence. float.pl --code --from seq.pl should find and fix the bug in seq.pl, leaving a corrected version of seq.pl as goo.pl. float.pl is non-deterministic, so depending on luck, number of generations, and other parameters, it may or may not arrive at a solution.
ARGUMENTS
Here are the arguments:
--learn <dir> -- feed .pl and .pm files in a directory into the Markov engine
--spew <n> -- (test) output n successive tokens from the Markov engine
--eval <str> -- (test) in-context eval; changes to the corpus are saved on exit
--code -- write a program to satisfy tests
--code options:
--chainlength <n> -- number of tokens (program size) in each semi-randomly generated initial specimen OR:
--from <fn.pl> -- file to start with; implies learning from it as well as mutating it directly
--generations <n> -- how many generations to run, max (stops early on a perfect score)
--keep <n> -- how many top performers of the previous generation to include in each new generation
--breed <n> -- how many children of the top performers to include in each new generation
--mutate <n> -- how many mutated children of the top performers to include in each new generation
--new <n> -- how many brand new, semi-random specimens to include in each new generation
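To illustrate how the --keep, --breed, --mutate, and --new counts could combine into one generation, here is a minimal sketch. It is not float.pl's actual code: next_generation, crossover, and the two callbacks are hypothetical names, specimens are assumed to be array refs of tokens, and single-point crossover is an assumed breeding strategy.

```perl
use strict;
use warnings;

# Hypothetical single-point crossover: the head of one parent's token
# list joined to the tail of the other's. Cut points are random.
sub crossover {
    my ($mom, $dad) = @_;
    my $cut_m = int rand(@$mom + 1);
    my $cut_d = int rand(@$dad + 1);
    return [ @{$mom}[0 .. $cut_m - 1], @{$dad}[$cut_d .. $#$dad] ];
}

# Hypothetical helper: build the next generation.
#   $scored       -- array ref of [ $score, $tokens ] pairs, in any order
#   $opt          -- hash ref with keep/breed/mutate/new counts
#   $new_specimen -- code ref returning a fresh token list (assumption)
#   $mutate       -- code ref returning a mutated copy of a token list (assumption)
sub next_generation {
    my ($scored, $opt, $new_specimen, $mutate) = @_;
    die "need at least one scored specimen\n" unless @$scored;

    # Rank specimens best-first by fitness score.
    my @ranked = map { $_->[1] } sort { $b->[0] <=> $a->[0] } @$scored;

    # The top performers survive unchanged and serve as parents.
    my $keep    = $opt->{keep} < @ranked ? $opt->{keep} : scalar @ranked;
    my @kept    = @ranked[0 .. $keep - 1];
    my @parents = @kept ? @kept : ($ranked[0]);
    my $pick    = sub { $parents[ int rand @parents ] };

    my @next = map { [@$_] } @kept;    # copies of the survivors
    push @next, crossover($pick->(), $pick->()) for 1 .. $opt->{breed};
    push @next, $mutate->($pick->())            for 1 .. $opt->{mutate};
    push @next, $new_specimen->()               for 1 .. $opt->{new};
    return \@next;
}
```

With enough scored specimens to satisfy the keep count, the generation size under this sketch is simply keep + breed + mutate + new.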
Acme::State is used to preserve program state between runs. If you tell it to --learn a directory, it'll remember everything it has seen in there until you remove your ~/float.pl.state file. This allows you to learn in one invocation and then generate code in another invocation.
--from uses a program you provide as one of the first generation of specimens. This is what you want if you're using Test::Float to try to fix a bug for you in existing code rather than writing code from scratch.
--code tells the thing to try to contrive a program that passes tests with the best score possible.
--code requires unit tests that return floating point values between 0 and 1 (inclusive) rather than ok and not ok. Genetic code specimens that do better are favored for preservation and breeding for next generations. Creating tests that describe the code you want written is critical. These live in the fitness-t/ directory.
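A score between 0 and 1 could be computed along the following lines. This is only a sketch: output_fitness is a hypothetical helper, and how the score is reported to Test::Float is not shown here; see the shipped fitness-t/goo.t for the real reporting format.

```perl
use strict;
use warnings;

# Hypothetical scoring helper: the fraction of expected output lines that
# the specimen reproduced in the right position. Returns 0 when nothing
# matches and 1 when every expected line matches.
sub output_fitness {
    my ($output, @expected) = @_;
    return 0 unless @expected;

    my @got = split /\n/, $output;

    # Count the positions where the actual line equals the expected line.
    my $hits = grep { defined $got[$_] && $got[$_] eq $expected[$_] }
               0 .. $#expected;

    return $hits / @expected;
}

# Illustrative values only: two of three expected lines match.
my $score = output_fitness("1\n1\n3\n", 1, 1, 2);
printf "%.3f\n", $score;
```

Graded scores like this give the genetic algorithm something to climb: a specimen that gets two of three lines right outranks one that gets none, where a plain ok/not ok would fail both.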
--code can be used in one of two basic ways. With a --from argument, it'll start from a pre-written script. It'll include an exact copy of that script in each generation, train the Markov engine from it, and generate an initial random population whose specimens have a similar number of tokens to it. Without --from, each specimen in the initial random population is --chainlength tokens long, or 20 by default.
Currently, you need to cd into the Test-Float-xx directory to use the --code operation, or else you need to copy or create a fitness-t/ directory with floating point tests. Either way, you need the fitness-t/ directory and fitness tests.
Two fitness test files ship with this thing, both in the fitness-t/ directory. The first fitness test, fitness-t/goo.t, has tests to see that the program is at least a reasonable length, passes syntax checks, isn't composed of too many comments, and a few other similar things. You may wish to keep this script as is, modify it, or extend it.
The other fitness test, fitness-t/logic.t, should be used as an example or demonstration only and then commented out, removed, or completely rewritten and adapted to the purpose at hand. As shipped, it tests for the first three numbers (kind of) in the Fibonacci sequence.
Numerous times each generation -- once for each specimen -- goo.pl is written out and the tests in fitness-t/ are run on it.
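The write-then-score cycle can be sketched roughly as follows. The names write_file and evaluate_population are hypothetical, and the scoring callback stands in for running the tests in fitness-t/; this is not how float.pl is actually structured.

```perl
use strict;
use warnings;

# Write a specimen's source text to disk, replacing any previous contents.
sub write_file {
    my ($path, $text) = @_;
    open my $fh, '>', $path or die "can't write $path: $!";
    print {$fh} $text;
    close $fh or die "can't close $path: $!";
}

# Hypothetical evaluation loop: each specimen in turn is written to $path
# and scored by the $score callback (a stand-in for the fitness tests).
# The best-scoring specimen is left in place at $path afterwards.
sub evaluate_population {
    my ($path, $score, @specimens) = @_;
    my ($best, $best_score);

    for my $code (@specimens) {
        write_file($path, $code);
        my $s = $score->($path);
        ($best, $best_score) = ($code, $s)
            if !defined $best_score || $s > $best_score;
    }

    write_file($path, $best) if defined $best;
    return ($best, $best_score);
}
```

Because every specimen passes through the same file, only one candidate exists on disk at a time, and the rest of the population stays in memory.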
After --code mode finishes running, the best contender will be left in place in goo.pl.
BUGS
WARNING! In the process of assimilating existing code and creating semi-random permutations from it, this script could easily come up with code that will ERASE YOUR DATA OR SEND INAPPROPRIATE PHOTOS TO YOUR INLAWS.
I'm serious. This thing generates quasi-random code and then RUNS IT. This is STUPID.
In fact, this thing is STUPID in general -- nearly as stupid as your average ASU undergrad. Far more intelligent genetic programming systems exist.
This program should be in Acme::.
AUTHOR
Scott Walters, <scott@slowass.net>
COPYRIGHT AND LICENSE
Copyright (C) 2010 by Scott Walters
This library is not free software; you can redistribute it and/or modify it provided you take my name off of it and accept or disclaim all responsibility for the horrible things it will inevitably do. By using this program, you agree not to use this program.
THIS PROGRAM MAKES NO WARRANTY OF FITNESS FOR ANY PURPOSE, INCLUDING THE PURPOSE OF NOT DELETING ALL OF YOUR DATA. This program is stupid and if you run it, so are you.
Do not email me and ask me to clarify the copyright license so you may include it in Debian. Let me save you the trouble: you may NOT include this program in Debian. You can include Test::Float itself, but you may not include this program.