NAME

extract-schemas - Extract test schemas from Perl modules

SYNOPSIS

extract-schemas [options] <module.pm>

Options:
  --output-dir DIR    Output directory for schema files (default: schemas/)
  --strict-pod=LEVEL  POD validation: off, warn (default) or fatal
  --verbose           Show detailed analysis
  --fuzz              Run coverage-guided fuzzing on extracted schemas
  --fuzz-iters N      Iterations per method when fuzzing (default: 100)
                      (no short form, to avoid conflict with --fuzz/-f)
  --fuzz-all          Fuzz all methods, including those with no input schema
  --corpus-dir DIR    Directory to persist fuzz corpora (default: schemas/corpus/)
  --help              Show this help message
  --man               Show full documentation

Examples:
  extract-schemas lib/MyModule.pm
  extract-schemas --output-dir my_schemas --verbose lib/MyModule.pm
  extract-schemas --fuzz lib/MyModule.pm
  extract-schemas --fuzz --fuzz-iters 300 --corpus-dir t/corpus lib/MyModule.pm
  extract-schemas --fuzz --fuzz-all lib/MyModule.pm

QUICK START

To analyze your module and automatically probe each method with fuzzed inputs, run:

extract-schemas --strict-pod=warn -v --fuzz lib/MyModule.pm

The fuzzer tries 100 inputs per method by default (see --fuzz-iters), looking for crashes caused by inputs that should be valid. Anything suspicious is saved to schemas/corpus/.

If genuine bugs are found, turn the saved inputs into regression tests:

fuzz-harness-generator --replay-corpus schemas/corpus/ -o t/fuzz_replay.t

These tests fail until you fix the underlying code and pass from then on. Run extract-schemas --fuzz regularly: each run loads the saved corpus and builds on it, probing deeper into your code each time.

Otherwise, run fuzz-harness-generator on the schema of each function in MyModule.pm, substituting the function's name for function:

fuzz-harness-generator -r schemas/function.yml
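The per-function step above can be looped over every extracted schema. The following is a minimal shell sketch, assuming the schemas were written to the default schemas/ directory with a .yml suffix as in the command above. The GENERATOR variable and the run_schemas function are hypothetical names introduced here so that the loop can be dry-run; they are not part of either tool.

```shell
#!/bin/sh
# Hypothetical helper: run "fuzz-harness-generator -r" on every schema file.
# GENERATOR is an assumption of this sketch, not an option of either tool;
# set GENERATOR=echo to preview the commands without running them.
GENERATOR="${GENERATOR:-fuzz-harness-generator}"

# Usage: run_schemas [SCHEMA_DIR]   (SCHEMA_DIR defaults to schemas)
run_schemas() {
    dir="${1:-schemas}"
    failed=0
    for schema in "${dir%/}"/*.yml; do
        # Skip the unexpanded glob when the directory holds no schemas
        [ -e "$schema" ] || continue
        if ! "$GENERATOR" -r "$schema"; then
            echo "FAILED: $schema" >&2
            failed=1
        fi
    done
    # Non-zero when at least one schema failed
    return "$failed"
}
```

Calling run_schemas with no argument walks schemas/; passing another directory, such as my_schemas, covers a custom --output-dir. The function returns non-zero if any invocation fails, which makes it usable as a CI step.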

DESCRIPTION

This tool analyzes a Perl module and generates a YAML schema file for each method. The schemas are suitable for use with App::Test::Generator: its fuzz-harness-generator program turns a schema into a .t file that can be run with prove.

The extractor uses three sources of information:

1. POD Documentation: Parses parameter descriptions from POD to extract types and constraints.
2. Code Analysis: Analyzes validation patterns in the code (ref checks, length checks, etc.).
3. Method Signatures: Extracts parameter names from method signatures.

The tool assigns a confidence level (high/medium/low) to each schema based on how much information it could infer.
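To make the three sources concrete, the shell sketch below writes a small hypothetical module that offers all of them: POD describing a parameter, validation code, and named parameters in the method body. MyModule, set_name, and the POD wording are illustrative assumptions. The exact POD phrasing the extractor recognizes is not specified here, so treat this as a plausible input and not as a guaranteed high-confidence case.

```shell
#!/bin/sh
# Write a hypothetical module to a scratch directory so it can be passed
# to extract-schemas. demo_dir is an illustrative name.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/lib"

cat > "$demo_dir/lib/MyModule.pm" <<'EOF'
package MyModule;

use strict;
use warnings;
use Carp qw(croak);

sub new { return bless {}, shift }

=head2 set_name($name)

Sets the name.

Parameters:

  $name - string, 5 to 100 characters, required

=cut

sub set_name {
    my ($self, $name) = @_;    # parameter names

    # Validation patterns of the kind code analysis looks for
    croak 'name is required'      unless defined $name;
    croak 'name must be a string' if ref $name;            # ref check
    croak 'name is too short'     if length($name) < 5;    # length checks
    croak 'name is too long'      if length($name) > 100;

    $self->{name} = $name;
    return $self;
}

1;
EOF

echo "Wrote $demo_dir/lib/MyModule.pm"
# Then, with App::Test::Generator installed:
#   extract-schemas --verbose "$demo_dir/lib/MyModule.pm"
```

Removing the POD block or the croak lines from the module and rerunning the extractor is a simple way to see how each source affects the reported confidence.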

FUZZING

When --fuzz is specified, the tool will additionally run App::Test::Generator::CoverageGuidedFuzzer against each method after schema extraction.

By default all methods with at least one known input parameter are fuzzed, regardless of confidence level. Use --fuzz-all to also attempt fuzzing methods with no input schema (these will use purely random generation).

The fuzzer will:

1. Load and require the target module at runtime.
2. Run coverage-guided fuzzing, using the extracted schema as the input spec.
3. Report any crashes or unexpected errors found.
4. Persist a corpus to --corpus-dir for incremental improvement across runs.

Corpus files are named <corpus-dir>/<method>.json and are automatically loaded on subsequent runs, so each run builds on the last.
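Because each method's corpus lives in its own file, it is easy to check which methods already have saved inputs. The sketch below relies only on the <corpus-dir>/<method>.json naming described above and on the default schemas/corpus/ directory. corpus_file and list_corpora are hypothetical helper names, and the JSON contents are not inspected.

```shell
#!/bin/sh
# Hypothetical helpers built only on the <corpus-dir>/<method>.json naming.

# Print the corpus path for a method.
# Usage: corpus_file METHOD [CORPUS_DIR]
corpus_file() {
    dir="${2:-schemas/corpus}"
    # ${dir%/} drops one trailing slash so "t/corpus/" and "t/corpus" agree
    printf '%s/%s.json\n' "${dir%/}" "$1"
}

# Print the name of every method that has a saved corpus, one per line.
# Usage: list_corpora [CORPUS_DIR]
list_corpora() {
    dir="${1:-schemas/corpus}"
    for f in "${dir%/}"/*.json; do
        [ -e "$f" ] || continue    # no corpus files yet
        basename "$f" .json
    done
}
```

For example, corpus_file set_name t/corpus prints t/corpus/set_name.json, and list_corpora t/corpus lists the methods that later runs with --corpus-dir t/corpus would resume from.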

SCHEMA FORMAT

The generated YAML files have the following structure:

method: method_name
confidence: high|medium|low
notes:
  - Any warnings or suggestions
input:
  param_name:
    type: string|integer|number|boolean|arrayref|hashref|object
    min: 5
    max: 100
    optional: 0
    matches: /pattern/
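
As a concrete instance of the structure above, the sketch below writes a hand-filled schema for a hypothetical set_name method that takes one required string. It is an illustration, not output captured from a real run: the method name, the values, the note text, and the reading of min and max as length bounds for a string are all assumptions.

```shell
#!/bin/sh
# Write an illustrative, hand-written schema to a scratch directory.
# schema_dir, set_name, and every value below are hypothetical.
schema_dir=$(mktemp -d)

cat > "$schema_dir/set_name.yml" <<'EOF'
method: set_name
confidence: medium
notes:
  - Length limits inferred from code, confirm against the POD
input:
  name:
    type: string
    min: 5
    max: 100
    optional: 0
EOF

# Show the schema that was written
cat "$schema_dir/set_name.yml"
```

A file like this is the thing to review and edit by hand when the extractor reports medium or low confidence, before passing it on to the test generator.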

CONFIDENCE LEVELS

high: Strong evidence from POD and code analysis. Schema should be accurate.
medium: Partial information available. Review recommended.
low: Limited information. Manual review required.

EXAMPLES

Basic Usage

extract-schemas lib/MyModule.pm

Fuzz methods with known inputs

extract-schemas --fuzz lib/MyModule.pm

Fuzz everything, 300 iterations, custom corpus dir

extract-schemas --fuzz --fuzz-all --fuzz-iters 300 --corpus-dir t/corpus lib/MyModule.pm

Incremental fuzzing (corpus grows across runs)

# First run: builds initial corpus
extract-schemas --fuzz lib/MyModule.pm

# Subsequent runs: loads corpus and extends it
extract-schemas --fuzz lib/MyModule.pm

Verbose Mode

extract-schemas --verbose lib/MyModule.pm

POD Checking

--strict-pod=LEVEL
  off    - do not validate POD
  warn   - warn on mismatches (default)
  fatal  - abort on mismatches

NEXT STEPS

After extracting schemas:

1. Review the generated YAML files, especially those marked low confidence.
2. Edit the schemas to add missing information or correct errors.
3. Use the schemas with App::Test::Generator:

fuzz-harness-generator -r schemas/my_method.yml

AUTHOR

Nigel Horne

