NAME
extract-schemas - Extract test schemas from Perl modules
SYNOPSIS
extract-schemas [options] <module.pm>
Options:
--output-dir DIR Output directory for schema files (default: schemas/)
--strict-pod=off|warn|fatal POD validation level (default: warn)
--verbose Show detailed analysis
--fuzz Run coverage-guided fuzzing on extracted schemas
--fuzz-iters N Iterations per method when fuzzing (default: 100)
(no short form, to avoid conflict with --fuzz/-f)
--fuzz-all Fuzz all methods, including those with no input schema
--corpus-dir DIR Directory to persist fuzz corpora (default: schemas/corpus/)
--help Show this help message
--man Show full documentation
Examples:
extract-schemas lib/MyModule.pm
extract-schemas --output-dir my_schemas --verbose lib/MyModule.pm
extract-schemas --fuzz lib/MyModule.pm
extract-schemas --fuzz --fuzz-iters 300 --corpus-dir t/corpus lib/MyModule.pm
extract-schemas --fuzz --fuzz-all lib/MyModule.pm
QUICK START
Run extract-schemas --strict-pod=warn -v --fuzz lib/MyModule.pm to analyse your module and automatically probe each method with hundreds of fuzzed inputs, looking for crashes caused by inputs that should be valid. Anything suspicious is saved to schemas/corpus/.
If genuine bugs are found, run fuzz-harness-generator --replay-corpus schemas/corpus/ -o t/fuzz_replay.t to turn them into regression tests that will fail until you fix the underlying code and pass forever after. Run extract-schemas --fuzz regularly - each run builds on the last, probing deeper into your code each time.
Otherwise, for each of the functions in MyModule.pm, run fuzz-harness-generator -r schemas/function.yml, where function is the name of that function.
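The per-function step can be scripted as a loop over the schema directory. This is a minimal sketch rather than part of the tool: the function name run_all_schemas and the HARNESS override variable are hypothetical, and it assumes the schema files end in .yml or .yaml.

```shell
# Run "fuzz-harness-generator -r" on every schema file in a directory.
# HARNESS is a hypothetical override so the loop can be dry-run with a stub.
run_all_schemas() {
    dir="${1:-schemas}"
    failed=0
    for schema in "$dir"/*.yml "$dir"/*.yaml; do
        [ -f "$schema" ] || continue    # skip glob patterns that matched nothing
        if ! "${HARNESS:-fuzz-harness-generator}" -r "$schema"; then
            echo "FAILED: $schema" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Calling run_all_schemas schemas processes every schema in turn and returns non-zero if any invocation failed, which makes it usable as a single CI step.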
DESCRIPTION
This tool analyzes a Perl module and generates a YAML schema file for each method. The schemas are suitable for use with App::Test::Generator: the fuzz-harness-generator program turns a schema into a .t file that can be run through prove.
The extractor uses three sources of information:
1. POD Documentation
    Parses parameter descriptions from POD to extract types and constraints.
2. Code Analysis
    Analyzes validation patterns in the code (ref checks, length checks, etc.).
3. Method Signatures
    Extracts parameter names from method signatures.
The tool assigns a confidence level (high/medium/low) to each schema based on how much information it could infer.
FUZZING
When --fuzz is specified, the tool will additionally run App::Test::Generator::CoverageGuidedFuzzer against each method after schema extraction.
By default all methods with at least one known input parameter are fuzzed, regardless of confidence level. Use --fuzz-all to also attempt fuzzing methods with no input schema (these will use purely random generation).
The fuzzer will:
- Load and require the target module at runtime
- Run coverage-guided fuzzing using the extracted schema as input spec
- Report any crashes or unexpected errors found
- Persist a corpus to --corpus-dir for incremental improvement across runs
Corpus files are named <corpus-dir>/<method>.json and are automatically loaded on subsequent runs, so each run builds on the last.
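Because each corpus file is named after its method, the file names alone show which methods already have a persisted corpus. A minimal sketch, assuming only the <corpus-dir>/<method>.json naming described above (the function name corpus_methods is hypothetical, and the JSON contents are not inspected):

```shell
# Print one method name per corpus file found in the corpus directory.
corpus_methods() {
    dir="${1:-schemas/corpus}"
    for f in "$dir"/*.json; do
        [ -f "$f" ] || continue    # skip the pattern itself when nothing matches
        basename "$f" .json
    done
}
```

Running corpus_methods t/corpus before and after a fuzzing run is a quick way to see which methods gained a corpus.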
SCHEMA FORMAT
The generated YAML files have the following structure:
method: method_name
confidence: high|medium|low
notes:
  - Any warnings or suggestions
input:
  param_name:
    type: string|integer|number|boolean|arrayref|hashref|object
    min: 5
    max: 100
    optional: 0
    matches: /pattern/
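After hand-editing a schema, a quick sanity check on its top-level keys can catch accidental damage. The following is a sketch under stated assumptions: check_schema is a hypothetical helper, it assumes method and confidence appear unquoted at the start of a line as in the layout above, and it deliberately does not require input, since methods with no input schema exist.

```shell
# Check that a schema file has a method key and a documented confidence level.
# Returns 0 when both are present and well-formed, 1 otherwise.
check_schema() {
    file="$1"
    missing=""
    for key in method confidence; do
        grep -q "^${key}:" "$file" || missing="$missing $key"
    done
    if [ -n "$missing" ]; then
        echo "$file: missing$missing" >&2
        return 1
    fi
    # confidence must be one of the three documented levels
    grep -Eq '^confidence:[[:space:]]*(high|medium|low)[[:space:]]*$' "$file" || {
        echo "$file: unexpected confidence value" >&2
        return 1
    }
}
```

This is a text-level check only; it does not parse the YAML, so a full YAML parser remains the better choice for validating nested input constraints.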
CONFIDENCE LEVELS
- high
    Strong evidence from POD and code analysis. Schema should be accurate.
- medium
    Partial information available. Review recommended.
- low
    Limited information. Manual review required.
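Since the confidence level is written into each schema file, the files that most need review can be listed with a simple search. A minimal sketch, assuming the top-level confidence: line from SCHEMA FORMAT and a .yml or .yaml extension (the function name schemas_with_confidence is hypothetical):

```shell
# Print the schema files in a directory whose confidence matches the given level.
schemas_with_confidence() {
    level="$1"
    dir="${2:-schemas}"
    for f in "$dir"/*.yml "$dir"/*.yaml; do
        [ -f "$f" ] || continue    # skip glob patterns that matched nothing
        if grep -q "^confidence:[[:space:]]*${level}[[:space:]]*\$" "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, schemas_with_confidence low schemas lists the files that require manual review, and the same call with medium lists those where review is recommended.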
EXAMPLES
Basic Usage
extract-schemas lib/MyModule.pm
Fuzz methods with known inputs
extract-schemas --fuzz lib/MyModule.pm
Fuzz everything, 300 iterations, custom corpus dir
extract-schemas --fuzz --fuzz-all --fuzz-iters 300 --corpus-dir t/corpus lib/MyModule.pm
Incremental fuzzing (corpus grows across runs)
# First run: builds initial corpus
extract-schemas --fuzz lib/MyModule.pm
# Subsequent runs: loads corpus and extends it
extract-schemas --fuzz lib/MyModule.pm
Verbose Mode
extract-schemas --verbose lib/MyModule.pm
POD Checking
--strict-pod=LEVEL
off - do not validate POD
warn - warn on mismatches (default)
fatal - abort on mismatches
NEXT STEPS
After extracting schemas:
1. Review the generated YAML files, especially those marked low confidence
2. Edit the schemas to add missing information or correct errors
3. Use the schemas with App::Test::Generator:
test-generator --schema schemas/my_method.yaml
SEE ALSO
App::Test::Generator, App::Test::Generator::CoverageGuidedFuzzer, PPI, Pod::Simple
AUTHOR
Nigel Horne