NAME
JSONL::Subset - Extract a percentage of lines from a JSONL file
SYNOPSIS
    use JSONL::Subset qw(subset_jsonl);

    subset_jsonl(
        infile  => "data.jsonl",
        outfile => "subset.jsonl",
        percent => 10,
        mode    => "random",  # or "start", "end"
        seed    => 42
    );
DESCRIPTION
This module helps you extract a subset of lines from a JSONL file, for sampling or inspection.
OPTIONS
infile
Path to the JSONL file to read from.
outfile
Path where the extracted subset will be written.
percent
Percentage of lines to retain.
mode
- random returns random lines
- start returns lines from the start
- end returns lines from the end
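To make the three modes concrete, here is a minimal sketch of how such a selection could work on an in-memory list of lines. The helper name pick_lines, the rounding down of the line count, and keeping random picks in file order are all assumptions made for illustration; none of this is taken from the JSONL::Subset source.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper (not part of JSONL::Subset): return the lines to keep.
sub pick_lines {
    my ($lines, $percent, $mode) = @_;
    my $total = scalar @$lines;
    my $keep  = int($total * $percent / 100);   # rounding down is an assumption
    return () if $keep < 1;

    if ($mode eq 'start') {
        return @{$lines}[0 .. $keep - 1];
    }
    elsif ($mode eq 'end') {
        return @{$lines}[$total - $keep .. $total - 1];
    }
    elsif ($mode eq 'random') {
        # Pick $keep random indices, then sort them to preserve file order
        # (preserving order is also an assumption).
        my @shuffled = shuffle(0 .. $total - 1);
        my @idx      = sort { $a <=> $b } @shuffled[0 .. $keep - 1];
        return @{$lines}[@idx];
    }
    die "unknown mode: $mode";
}

my @lines = map { qq({"id":$_}) } 1 .. 10;
print join(",", pick_lines(\@lines, 30, 'start')), "\n";  # {"id":1},{"id":2},{"id":3}
print join(",", pick_lines(\@lines, 30, 'end')),   "\n";  # {"id":8},{"id":9},{"id":10}
```

With 10 lines and percent set to 30, each mode keeps 3 lines; only the choice of which 3 differs.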
seed
Optional. Only used with the random mode, to make the selection reproducible.
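The effect of a seed can be illustrated with Perl's built-in srand and rand: seeding the generator with the same value yields the same sequence of random numbers, and therefore the same selection. The helper sample_indices below is hypothetical and only sketches the idea; whether the module seeds its generator this way is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: draw $keep distinct indices from 0 .. $total - 1.
sub sample_indices {
    my ($total, $keep, $seed) = @_;
    srand($seed) if defined $seed;   # same seed, same sequence from rand()
    my @idx = (0 .. $total - 1);
    # Partial Fisher-Yates shuffle: only the first $keep slots are needed.
    for my $i (0 .. $keep - 1) {
        my $j = $i + int(rand($total - $i));
        @idx[$i, $j] = @idx[$j, $i];
    }
    return @idx[0 .. $keep - 1];
}

my @first  = sample_indices(100, 10, 42);
my @second = sample_indices(100, 10, 42);
print "identical\n" if "@first" eq "@second";
```

Two calls with the same seed return the same indices within one Perl installation. The exact values may differ between Perl versions or platforms, so a seed is best treated as reproducible per environment.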
streaming
If set, infile is read line by line. This uses less RAM but takes more wall-clock time.
Recommended for large JSONL files.
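The memory/time trade-off can be seen in a two-pass sketch: one pass counts the lines, a second pass decides line by line whether to keep each one, so the file is never held in memory but is read twice. The function stream_subset and its selection-sampling approach are assumptions for illustration, not the module's actual implementation.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical two-pass streaming sampler; not the module's actual code.
sub stream_subset {
    my ($infile, $outfile, $percent, $seed) = @_;
    srand($seed) if defined $seed;

    # Pass 1: count lines without storing them.
    open my $in, '<', $infile or die "Cannot open $infile: $!";
    my $total = 0;
    while (my $line = <$in>) { $total++ }
    close $in;

    my $need = int($total * $percent / 100);   # rounding down is an assumption

    # Pass 2: selection sampling. Each line is kept with probability
    # (lines still needed) / (lines still unread), giving exactly $need lines.
    open $in, '<', $infile or die "Cannot open $infile: $!";
    open my $out, '>', $outfile or die "Cannot open $outfile: $!";
    my $remaining = $total;
    while (my $line = <$in>) {
        if ($need > 0 && rand($remaining) < $need) {
            print {$out} $line;
            $need--;
        }
        $remaining--;
    }
    close $in;
    close $out or die "Cannot close $outfile: $!";
    return;
}

# Demo on a temporary 20-line file: 25 percent keeps 5 lines.
my ($ifh, $infile) = tempfile(UNLINK => 1);
print {$ifh} qq({"id":$_}\n) for 1 .. 20;
close $ifh;
my ($ofh, $outfile) = tempfile(UNLINK => 1);
close $ofh;
stream_subset($infile, $outfile, 25, 42);
```

Only one line is in memory at a time, which is why this style suits large files; the cost is the extra pass over the input.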