NAME

JSONL::Subset - Extract a percentage of lines from a JSONL file

SYNOPSIS

use JSONL::Subset qw(subset_jsonl);

subset_jsonl(
    infile  => "data.jsonl",
    outfile => "subset.jsonl",
    percent => 10,
    mode    => "random",  # or "start", "end"
    seed    => 42
);

DESCRIPTION

This module helps you extract a subset of lines from a JSONL file, for sampling or inspection.

OPTIONS

infile

Path to the JSONL file to read from.

outfile

Path to the file where the subset is written.

percent

Percentage of lines to retain.

mode

- random returns random lines
- start returns lines from the start
- end returns lines from the end
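As a rough illustration of what the three modes mean, here is a minimal in-memory sketch. The helper pick_lines is hypothetical and not part of JSONL::Subset. Rounding the line count down and keeping randomly chosen lines in their original order are assumptions, not documented behaviour of the module.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper, NOT part of JSONL::Subset: selects a percentage
# of lines from an in-memory list according to the given mode.
sub pick_lines {
    my (%args) = @_;
    my @lines = @{ $args{lines} };
    my $mode  = defined $args{mode} ? $args{mode} : 'random';

    # Assumption: the number of lines to keep is rounded down.
    my $keep = int(@lines * $args{percent} / 100);
    return () if $keep <= 0;

    # start: the first $keep lines
    return @lines[0 .. $keep - 1] if $mode eq 'start';

    # end: the last $keep lines
    return @lines[@lines - $keep .. $#lines] if $mode eq 'end';

    # random: seeding the generator makes the selection repeatable
    srand($args{seed}) if defined $args{seed};
    my @shuffled = shuffle(0 .. $#lines);

    # Assumption: chosen lines are returned in their original order.
    my @chosen = sort { $a <=> $b } @shuffled[0 .. $keep - 1];
    return @lines[@chosen];
}

my @lines = map { "line $_" } 1 .. 20;
print "$_\n" for pick_lines(lines => \@lines, percent => 10, mode => 'end');
```

With 20 input lines and percent => 10, each mode keeps two lines; only which two differs.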

seed

Optional. Only used with the random mode, for reproducibility.

streaming

If set, infile is streamed line by line. This uses less RAM but takes more wall-clock time.

Recommended for large JSONL files.
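To show why line-by-line processing trades time for memory, here is a minimal sketch for the start case. The helper stream_start is hypothetical and not part of JSONL::Subset, and the two-pass approach (count first, then copy) is an assumption about how such streaming could work, not a description of the module's internals. It also assumes the line count is rounded down.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of JSONL::Subset: copies the first
# $percent percent of lines from $in to $out, holding only one line
# in memory at a time.
sub stream_start {
    my ($in, $out, $percent) = @_;

    # Pass 1: count the lines without storing them.
    open my $fh, '<', $in or die "Cannot open $in: $!";
    my $total = 0;
    while (defined(my $line = <$fh>)) {
        $total++;
    }
    close $fh;

    # Assumption: the number of lines to keep is rounded down.
    my $keep = int($total * $percent / 100);

    # Pass 2: copy lines until the quota is reached.
    open $fh, '<', $in or die "Cannot open $in: $!";
    open my $ofh, '>', $out or die "Cannot open $out: $!";
    my $written = 0;
    while ($written < $keep) {
        my $line = <$fh>;
        last unless defined $line;
        print {$ofh} $line;
        $written++;
    }
    close $fh;
    close $ofh or die "Cannot close $out: $!";

    return $written;
}
```

Calling stream_start("data.jsonl", "subset.jsonl", 10) would read the input twice, which is where the extra wall-clock time comes from in this sketch, while memory use stays constant regardless of file size.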