NAME

JSON::LINQ - LINQ-style query interface for JSON, JSONL, LTSV, and CSV files

VERSION

Version 1.02

SYNOPSIS

use JSON::LINQ;

# Read JSON file (array of objects) and query
my @results = JSON::LINQ->FromJSON("users.json")
    ->Where(sub { $_[0]{age} >= 18 })
    ->Select(sub { $_[0]{name} })
    ->Distinct()
    ->ToArray();

# Read JSONL (JSON Lines) file - one JSON object per line
my @errors = JSON::LINQ->FromJSONL("events.jsonl")
    ->Where(sub { $_[0]{level} eq 'ERROR' })
    ->ToArray();

# DSL syntax for simple filtering
my @active = JSON::LINQ->FromJSON("users.json")
    ->Where(status => 'active')
    ->ToArray();

# Grouping and aggregation
my @stats = JSON::LINQ->FromJSON("orders.json")
    ->GroupBy(sub { $_[0]{category} })
    ->Select(sub {
        my $g = shift;
        return {
            Category => $g->{Key},
            Count    => scalar(@{$g->{Elements}}),
            Total    => JSON::LINQ->From($g->{Elements})
                            ->Sum(sub { $_[0]{amount} }),
        };
    })
    ->OrderByDescending(sub { $_[0]{Total} })
    ->ToArray();

# Write results back as JSON or JSONL
JSON::LINQ->From(\@results)->ToJSON("output.json");
JSON::LINQ->From(\@results)->ToJSONL("output.jsonl");

# Read/write CSV files (Comma-Separated Values)
my @rows = JSON::LINQ->FromCSV("access.csv")
    ->Where(sub { $_[0]{status} eq '200' })
    ->ToArray();
JSON::LINQ->From(\@rows)->ToCSV("filtered.csv");

# JOIN a JSON file (main) with a CSV lookup table
my $depts = JSON::LINQ->FromCSV("departments.csv");
my @joined = JSON::LINQ->FromJSON("employees.json")
    ->Join($depts,
        sub { $_[0]{dept_id} },
        sub { $_[0]{id}      },
        sub { { name => $_[0]{name}, dept => $_[1]{name} } })
    ->ToArray();

# CSV to JSON conversion
JSON::LINQ->FromCSV("data.csv")
    ->Where(sub { $_[0]{active} eq '1' })
    ->ToJSON("active.json");

# Read/write LTSV files (Labeled Tab-Separated Values)
my @rows = JSON::LINQ->FromLTSV("access.ltsv")
    ->Where(sub { $_[0]{status} eq '200' })
    ->ToArray();
JSON::LINQ->From(\@rows)->ToLTSV("filtered.ltsv");

# JOIN a JSON file (main) with an LTSV file (sub-table)
my $depts = JSON::LINQ->FromLTSV("departments.ltsv");
my @joined = JSON::LINQ->FromJSON("employees.json")
    ->Join($depts,
        sub { $_[0]{dept_id} },
        sub { $_[0]{id}      },
        sub { { name => $_[0]{name}, dept => $_[1]{name} } })
    ->ToArray();

# JOIN an LTSV file (main) with a JSON file (sub-table)
my $prices = JSON::LINQ->FromJSON("prices.json");
my @priced = JSON::LINQ->FromLTSV("orders.ltsv")
    ->Join($prices,
        sub { $_[0]{sku} },
        sub { $_[0]{sku} },
        sub { { order_id => $_[0]{id},
                amount   => $_[0]{qty} * $_[1]{price} } })
    ->ToArray();

# Boolean values
my $rec = { active => JSON::LINQ::true, count => 0 };
JSON::LINQ->From([$rec])->ToJSON("output.json");
# ToJSON encodes as: {"active":true,"count":0}

DESCRIPTION

JSON::LINQ provides a LINQ-style query interface for JSON, JSONL (JSON Lines), LTSV (Labeled Tab-Separated Values), and CSV (Comma-Separated Values) files. It is the JSON counterpart of LTSV::LINQ, sharing the same LINQ API and adding JSON and CSV I/O methods.

Key features:

  • Lazy evaluation - O(1) memory for JSONL, LTSV, and CSV streaming; JSON arrays are loaded once, then iterated lazily

  • Method chaining - Fluent, readable query composition

  • DSL syntax - Simple key-value filtering

  • 67 LINQ methods - the 60 methods shared with LTSV::LINQ (LTSV I/O FromLTSV and ToLTSV included), plus JSON I/O (FromJSON, FromJSONL, FromJSONString, ToJSON, ToJSONL) and CSV I/O (FromCSV, ToCSV)

  • Pure Perl - No XS dependencies

  • Perl 5.005_03+ - Works on ancient and modern Perl

  • Built-in JSON parser - No CPAN JSON module required

Supported Data Sources

  • FromJSON($file) - JSON file containing a top-level array or object

  • FromJSONL($file) - JSONL file (one JSON value per line)

  • FromJSONString($json) - JSON string (array or object)

  • FromLTSV($file) - LTSV file (Labeled Tab-Separated Values)

  • FromCSV($file) - CSV file (Comma-Separated Values; also TSV via sep option)

  • From(\@array) - In-memory Perl array

  • Range($start, $count) - Integer sequence

  • Empty() - Empty sequence

  • Repeat($element, $count) - Repeated element

What is JSONL?

JSONL (JSON Lines, also known as ndjson - newline-delimited JSON) is a text format where each line is a valid JSON value (typically an object). It is particularly suited for log files and streaming data because:

  • One record per line enables streaming with O(1) memory usage

  • Compatible with standard Unix tools (grep, sed, awk)

  • Easily appendable without rewriting the whole file

  • Each line is independently parseable

Format example:

{"time":"2026-04-20T10:00:00","host":"192.0.2.1","status":200,"url":"/"}
{"time":"2026-04-20T10:00:01","host":"192.0.2.2","status":404,"url":"/missing"}

FromJSONL reads these files lazily (one line at a time), matching the memory efficiency of LTSV::LINQ's FromLTSV.

What is LINQ?

LINQ (Language Integrated Query) is the Microsoft .NET query API. This module brings the same LINQ interface to JSON data in Perl. See LTSV::LINQ for a detailed description of the LINQ design philosophy.

INCLUDED DOCUMENTATION

The eg/ directory contains sample programs:

eg/01_json_query.pl       FromJSON/Where/Select/OrderByDescending/Distinct/ToLookup
eg/02_jsonl_query.pl      FromJSONL streaming, GroupBy, aggregation, ToJSONL
eg/03_grouping.pl         GroupBy, ToLookup, GroupJoin, SelectMany, Join
eg/04_sorting.pl          OrderBy/ThenBy multi-key sort, OrderByNum vs OrderByStr
eg/05_json_ltsv_join.pl   JOIN main JSON x sub-table LTSV
eg/06_ltsv_json_join.pl   JOIN main LTSV x sub-table JSON
eg/07_csv_query.pl        FromCSV/Where/Select/GroupBy/OrderByNum/ToCSV
eg/08_csv_json_join.pl    JOIN main CSV x sub-table JSON, CSV to JSON conversion

The doc/ directory contains JSON::LINQ cheat sheets in 21 languages:

doc/json_linq_cheatsheet.EN.txt   English
doc/json_linq_cheatsheet.JA.txt   Japanese
doc/json_linq_cheatsheet.ZH.txt   Chinese (Simplified)
doc/json_linq_cheatsheet.TW.txt   Chinese (Traditional)
doc/json_linq_cheatsheet.KO.txt   Korean
doc/json_linq_cheatsheet.FR.txt   French
doc/json_linq_cheatsheet.ID.txt   Indonesian
doc/json_linq_cheatsheet.VI.txt   Vietnamese
doc/json_linq_cheatsheet.TH.txt   Thai
doc/json_linq_cheatsheet.HI.txt   Hindi
doc/json_linq_cheatsheet.BN.txt   Bengali
doc/json_linq_cheatsheet.TR.txt   Turkish
doc/json_linq_cheatsheet.MY.txt   Burmese
doc/json_linq_cheatsheet.TL.txt   Filipino
doc/json_linq_cheatsheet.KM.txt   Khmer
doc/json_linq_cheatsheet.MN.txt   Mongolian
doc/json_linq_cheatsheet.NE.txt   Nepali
doc/json_linq_cheatsheet.SI.txt   Sinhala
doc/json_linq_cheatsheet.UR.txt   Urdu
doc/json_linq_cheatsheet.UZ.txt   Uzbek
doc/json_linq_cheatsheet.BM.txt   Malay

METHODS

Complete Method Reference

This module implements 67 LINQ methods organized into 15 categories. In addition, true and false boolean accessor functions are provided.

  • Data Sources (9): From, FromJSON, FromJSONL, FromJSONString, FromLTSV, FromCSV, Range, Empty, Repeat

  • Filtering (1): Where (with DSL)

  • Projection (2): Select, SelectMany

  • Concatenation (2): Concat, Zip

  • Partitioning (4): Take, Skip, TakeWhile, SkipWhile

  • Ordering (13): OrderBy, OrderByDescending, OrderByStr, OrderByStrDescending, OrderByNum, OrderByNumDescending, Reverse, ThenBy, ThenByDescending, ThenByStr, ThenByStrDescending, ThenByNum, ThenByNumDescending

  • Grouping (1): GroupBy

  • Set Operations (4): Distinct, Union, Intersect, Except

  • Join (2): Join, GroupJoin

  • Quantifiers (3): All, Any, Contains

  • Comparison (1): SequenceEqual

  • Element Access (8): First, FirstOrDefault, Last, LastOrDefault, Single, SingleOrDefault, ElementAt, ElementAtOrDefault

  • Aggregation (7): Count, Sum, Min, Max, Average, AverageOrDefault, Aggregate

  • Conversion (9): ToArray, ToList, ToDictionary, ToLookup, ToJSON, ToJSONL, ToLTSV, ToCSV, DefaultIfEmpty

  • Utility (1): ForEach

JSON-Specific Data Source Methods

FromJSON($filename)

Read a JSON file containing a top-level array of values. Each element of the array becomes one item in the sequence.

my $q = JSON::LINQ->FromJSON("users.json");

If the file contains a single JSON object (not an array), it is treated as a one-element sequence.

File format:

[
  {"name": "Alice", "age": 30},
  {"name": "Bob",   "age": 25}
]

The entire file is read into memory and parsed once. For large files, consider JSONL format with FromJSONL for streaming access.

Concurrent use (e.g. Join/GroupJoin): On Perl 5.006 and later, each call to FromJSON uses a distinct numbered filehandle slot, so multiple iterators may be open simultaneously without interference. On Perl 5.005_03, a unique numbered package glob is used per call (JSON::LINQ::FH::H1, JSON::LINQ::FH::H2, ...) to achieve the same safety.

FromJSONL($filename)

Read a JSONL (JSON Lines) file. Each non-empty line is parsed as a separate JSON value. Empty lines and lines starting with # are skipped.

my $q = JSON::LINQ->FromJSONL("events.jsonl");

File format:

{"event":"login","user":"alice","ts":1713600000}
{"event":"purchase","user":"alice","ts":1713600060,"amount":29.99}
{"event":"logout","user":"alice","ts":1713600120}

FromJSONL reads lazily (one line at a time), providing O(1) memory usage for arbitrarily large files.

Invalid JSON lines produce a warning and are skipped rather than aborting the entire sequence.
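
The reading rules above (skip empty lines, skip lines starting with #, warn and continue on invalid JSON) can be sketched as a small pull-style iterator. This is an illustration only, not the module's code: jsonl_iterator is a hypothetical helper, JSON::PP stands in for the built-in parser, and the in-memory filehandle assumes Perl 5.8 or later.

```perl
use strict;
use warnings;
use JSON::PP ();   # stand-in decoder (assumption); JSON::LINQ has its own parser

# Hypothetical helper: returns a code ref that yields the next decoded
# record on each call, or undef at end of input.
sub jsonl_iterator {
    my ($fh) = @_;
    my $json = JSON::PP->new->allow_nonref;
    return sub {
        while (defined(my $line = <$fh>)) {
            $line =~ s/\r?\n\z//;          # strip LF or CRLF
            next if $line =~ /^\s*\z/;     # skip empty lines
            next if $line =~ /^#/;         # skip comment lines
            my $value = eval { $json->decode($line) };
            if ($@) {
                warn "skipping invalid JSON line: $@";
                next;                      # keep going after a bad line
            }
            return $value;
        }
        return undef;                      # end of sequence
    };
}

my $text = <<'END';
{"event":"login","user":"alice"}

# a comment line
{"event": broken
{"event":"logout","user":"alice"}
END

open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
my $next = jsonl_iterator($fh);
my @events;
while (defined(my $rec = $next->())) {
    push @events, $rec->{event};
}
close($fh);
print join(',', @events), "\n";   # prints "login,logout"
```

Only one line is held in memory at a time, which is where the O(1) memory behaviour comes from.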

Concurrent use (e.g. Join/GroupJoin): On Perl 5.006 and later, each call to FromJSONL uses a distinct numbered filehandle slot, so multiple iterators may be open simultaneously without interference. On Perl 5.005_03, a unique numbered package glob is used per call (JSON::LINQ::FH::H1, JSON::LINQ::FH::H2, ...) to achieve the same safety.

FromJSONString($json)

Create a query from a JSON string. Accepts a JSON array (each element becomes one sequence item) or a JSON object (single-element sequence).

my $q = JSON::LINQ->FromJSONString('[{"id":1},{"id":2}]');
my $q = JSON::LINQ->FromJSONString('{"id":1,"name":"Alice"}');

LTSV Interoperability

To make it easy to JOIN JSON data with LTSV master/lookup tables (or vice versa) without requiring LTSV::LINQ to be installed, JSON::LINQ ships with built-in LTSV I/O methods. The LTSV format is described at http://ltsv.org/.

FromLTSV($filename)

Read an LTSV (Labeled Tab-Separated Values) file. Each line is split on TAB, and each field is split on the first colon to produce a label/value pair. The result is a sequence of hash references.

my $q = JSON::LINQ->FromLTSV("departments.ltsv");

File format:

id:1<TAB>name:Engineering<TAB>head:Alice
id:2<TAB>name:Sales<TAB>head:Bob

FromLTSV reads lazily (one line at a time), so memory usage is O(1) even for very large files. Empty lines are skipped. CR is stripped to handle CRLF files on any platform.
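
As a rough sketch of the parsing rule described above (split on TAB, then split each field on its first colon), assuming a hypothetical helper named parse_ltsv_line and assuming that a field without a colon gets an empty value:

```perl
use strict;
use warnings;

# Hypothetical helper: parse one LTSV line into a hash reference.
sub parse_ltsv_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                  # strip LF or CRLF
    my %record;
    for my $field (split /\t/, $line) {
        # A limit of 2 splits on the first colon only, so values may contain ':'
        my ($label, $value) = split /:/, $field, 2;
        next unless defined $label && length $label;
        $record{$label} = defined $value ? $value : '';   # assumption: no colon => ''
    }
    return \%record;
}

my $rec = parse_ltsv_line("time:10:00:01\thost:192.0.2.1\tstatus:200\r\n");
print "$rec->{time} $rec->{host} $rec->{status}\n";   # prints "10:00:01 192.0.2.1 200"
```

Splitting on the first colon only is what lets timestamps and URLs survive as values.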

Concurrent use (e.g. Join/GroupJoin): On Perl 5.006 and later, each call to FromLTSV uses a distinct numbered filehandle slot, so multiple iterators may be open simultaneously without interference. On Perl 5.005_03, a unique numbered package glob is used per call (JSON::LINQ::FH::H1, JSON::LINQ::FH::H2, ...) to achieve the same safety.

ToLTSV($filename)
ToLTSV($filename, label_order => \@labels)
ToLTSV($filename, headers => \@labels)

Write the sequence as an LTSV file. Each element must be a HASH reference. TAB, CR, and LF in values are sanitized to a single space to keep the file structurally valid.

$query->ToLTSV("output.ltsv");

Output format (default - all keys, alphabetical):

age:30<TAB>name:Alice
age:25<TAB>name:Bob

label_order (or its alias headers) specifies which labels to emit and in what order. Labels not present in a record are silently skipped.

$query->ToLTSV("output.ltsv", label_order => [qw(name age)]);
$query->ToLTSV("output.ltsv", headers     => [qw(name age)]);

Output format (with label_order):

name:Alice<TAB>age:30
name:Bob<TAB>age:25
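
The default ordering, label_order, and value sanitizing described above can be sketched as follows. format_ltsv_line is a hypothetical helper, and replacing each TAB, CR, or LF with one space is an assumption about how the sanitizing is done.

```perl
use strict;
use warnings;

# Hypothetical helper: format one hash record as an LTSV line (no newline).
# $order is an optional array reference of labels, playing the role of label_order.
sub format_ltsv_line {
    my ($record, $order) = @_;
    my @labels = $order ? @$order : sort keys %$record;
    my @fields;
    for my $label (@labels) {
        next unless exists $record->{$label};   # labels absent from the record are skipped
        my $value = defined $record->{$label} ? $record->{$label} : '';
        $value =~ s/[\t\r\n]/ /g;               # keep the line structurally valid
        push @fields, "$label:$value";
    }
    return join("\t", @fields);
}

my $rec     = { name => "Alice", age => 30, note => "line1\nline2" };
my $default = format_ltsv_line($rec);                       # all keys, alphabetical
my $ordered = format_ltsv_line($rec, [qw(name age city)]);  # city is absent, so skipped
print "$default\n$ordered\n";
```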

CSV Interoperability

CSV (Comma-Separated Values) is the most widely used format for tabular data exchange. FromCSV and ToCSV let a JSON::LINQ pipeline read from and write to CSV files without requiring any extra CPAN module.

The separator character defaults to ',' but can be set to "\t" to handle TSV (Tab-Separated Values) files, or any other single character.

FromCSV($filename)
FromCSV($filename, sep => $char)
FromCSV($filename, headers => \@cols)
FromCSV($filename, headers => \@cols, skip_header => 1)

Read a CSV file. The first line is used as the header row (column names), and each subsequent data row is returned as a hash reference with those column names as keys.

Options:

  • sep - Field separator character (default: ','). Use "\t" for TSV.

  • headers - Array reference of column names. When given, the first line of the file is treated as data rather than as a header row. Combine with skip_header => 1 to skip an existing header row in the file.

  • skip_header - If true, skip the first line of the file even when headers is given.

# Standard CSV with header row
my $q = JSON::LINQ->FromCSV("data.csv");

# Tab-separated (TSV)
my $q = JSON::LINQ->FromCSV("data.tsv", sep => "\t");

# Headerless CSV with explicit column names
my $q = JSON::LINQ->FromCSV("noheader.csv",
    headers => [qw(name age city)]);

FromCSV reads the file lazily (one line at a time), providing O(1) memory usage for arbitrarily large files.

RFC 4180 compliance: Quoted fields, including fields that contain the separator or double-quotes escaped as "", are handled correctly as long as each record fits on one line. Quoted fields with embedded newlines are the one known exception; see "LIMITATIONS AND KNOWN ISSUES".
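
For illustration, single-line RFC 4180 field splitting can be done with a small character-level state machine. The sketch below is not the module's parser; split_csv_line is a hypothetical helper that assumes a one-character separator and a record that fits on one line.

```perl
use strict;
use warnings;

# Hypothetical helper: split one CSV line into fields, honouring quotes.
sub split_csv_line {
    my ($line, $sep) = @_;
    $sep = ',' unless defined $sep;
    $line =~ s/\r?\n\z//;
    my @fields;
    my $field     = '';
    my $in_quotes = 0;
    my @chars     = split //, $line;
    for (my $i = 0; $i < @chars; $i++) {
        my $c = $chars[$i];
        if ($in_quotes) {
            if ($c eq '"') {
                if ($i + 1 < @chars && $chars[$i + 1] eq '"') {
                    $field .= '"';      # "" inside quotes is a literal quote
                    $i++;
                }
                else {
                    $in_quotes = 0;     # closing quote
                }
            }
            else {
                $field .= $c;           # separator inside quotes is plain data
            }
        }
        elsif ($c eq '"') {
            $in_quotes = 1;             # opening quote
        }
        elsif ($c eq $sep) {
            push @fields, $field;       # end of field
            $field = '';
        }
        else {
            $field .= $c;
        }
    }
    push @fields, $field;               # last field (may be empty)
    return @fields;
}

my @fields = split_csv_line(qq{Alice,"Tokyo, Japan","She said ""hi""",\n});
print scalar(@fields), " fields: ", join(' | ', @fields), "\n";
```

Because the state is reset for every line, a quoted field that spans lines cannot be reassembled, which matches the limitation noted above.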

Concurrent use (e.g. Join/GroupJoin): Each call to FromCSV uses a unique numbered package glob (JSON::LINQ::FH::H1, H2, ...) on all Perl versions, so multiple CSV iterators may be open simultaneously without interference.

ToCSV($filename)
ToCSV($filename, sep => $char)
ToCSV($filename, headers => \@cols)
ToCSV($filename, label_order => \@cols)
ToCSV($filename, no_header => 1)

Write the sequence as a CSV file.

Options:

  • sep - Field separator character (default: ',').

  • headers - Array reference of column names that controls which keys are written and in what order. Also serves as the header row.

  • label_order - Alias for headers.

  • no_header - If true, suppress the header row entirely.

$query->ToCSV("output.csv");
$query->ToCSV("output.tsv", sep => "\t");
$query->ToCSV("output.csv", headers => [qw(name age city)]);

When headers/label_order is not supplied and elements are HASH references, column names are taken from the first record's keys in alphabetical order.
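
A minimal sketch of that default: take the first record's keys in alphabetical order as the header row, then write each record in that column order. The quoting rule shown (quote only fields containing the separator, a double-quote, CR, or LF) is an assumption based on RFC 4180, and csv_field and csv_line are hypothetical helpers, not the module's internals.

```perl
use strict;
use warnings;

# Hypothetical helper: quote a field only when it needs quoting.
sub csv_field {
    my ($value, $sep) = @_;
    $value = '' unless defined $value;
    if ($value =~ /["\r\n]/ || index($value, $sep) >= 0) {
        $value =~ s/"/""/g;             # escape embedded quotes by doubling
        return qq{"$value"};
    }
    return $value;
}

# Hypothetical helper: one output line for one record, in column order.
sub csv_line {
    my ($record, $cols, $sep) = @_;
    return join($sep, map { csv_field($record->{$_}, $sep) } @$cols);
}

my @records = (
    { name => 'Alice',   city => 'Tokyo, Japan' },
    { name => 'Bob "B"', city => 'Osaka' },
);
my @cols  = sort keys %{ $records[0] };   # first record's keys, alphabetical
my @lines = (join(',', @cols), map { csv_line($_, \@cols, ',') } @records);
print "$_\n" for @lines;
```

The header must be known before the first data row is written, which is consistent with ToCSV being listed as O(n) under "Memory Characteristics".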

JSON-Specific Conversion Methods

ToJSON($filename)

Write the sequence to a file as a single JSON array, with each element encoded as one JSON value.

$query->ToJSON("output.json");

Output format:

[
{"age":30,"name":"Alice"},
{"age":25,"name":"Bob"}
]

Hash keys are sorted alphabetically for deterministic output.
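
To show what sorted-key output looks like, the sketch below reproduces the layout above with JSON::PP and its canonical option. JSON::PP is only a stand-in here (an assumption made for illustration); JSON::LINQ uses its own built-in encoder and does not need JSON::PP.

```perl
use strict;
use warnings;
use JSON::PP ();   # stand-in encoder for this illustration only

my @records = (
    { name => "Alice", age => 30 },
    { name => "Bob",   age => 25 },
);

# canonical() makes JSON::PP emit hash keys in sorted order, so the same
# data always serialises to the same bytes.
my $json = JSON::PP->new->canonical;
my $out  = "[\n" . join(",\n", map { $json->encode($_) } @records) . "\n]\n";
print $out;
```

Without a fixed key order, Perl's hash ordering could vary between runs, which makes output files hard to diff.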

ToJSONL($filename)

Write the sequence as a JSONL file. Each element is written as one line of JSON. This is the streaming counterpart of ToJSON.

$query->ToJSONL("output.jsonl");

Output format:

{"age":30,"name":"Alice"}
{"age":25,"name":"Bob"}

Boolean Values

JSON::LINQ provides boolean singleton objects compatible with JSON encoding:

JSON::LINQ::true   # stringifies as "true",  numifies as 1
JSON::LINQ::false  # stringifies as "false", numifies as 0

Use these when creating data structures that will be serialised to JSON:

my $rec = { active => JSON::LINQ::true, count => 0 };
# ToJSON encodes as: {"active":true,"count":0}

When FromJSON or FromJSONL decodes a JSON true or false, the result is a JSON::LINQ::Boolean object that behaves as 1 or 0 in numeric and boolean context.
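
One way such a boolean object can be built is with Perl's overload pragma. The sketch below is an assumption about the general technique, not the module's source; My::Boolean is a hypothetical package name.

```perl
use strict;
use warnings;

package My::Boolean;    # hypothetical name, not JSON::LINQ::Boolean itself
use overload
    '""'     => sub { ${ $_[0] } ? 'true' : 'false' },   # string context
    '0+'     => sub { ${ $_[0] } ? 1 : 0 },              # numeric context
    'bool'   => sub { ${ $_[0] } ? 1 : 0 },              # boolean context
    fallback => 1;

# Singletons: every true() call returns the same object, likewise false().
my $TRUE  = bless \(my $t = 1), __PACKAGE__;
my $FALSE = bless \(my $f = 0), __PACKAGE__;
sub true  { $TRUE }
sub false { $FALSE }

package main;

my $flag = My::Boolean::true();
my $off  = My::Boolean::false();
print "string: $flag, number: ", $flag + 0, "\n";   # prints "string: true, number: 1"
print "false is falsy\n" unless $off;
```

An encoder can then test ref($value) to emit a bare true or false instead of a quoted string or a number.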

All Other Methods

All other LINQ methods are inherited from LTSV::LINQ and behave identically. Please refer to LTSV::LINQ for complete documentation of:

Where, Select, SelectMany, Concat, Zip, Take, Skip, TakeWhile, SkipWhile, OrderBy, OrderByDescending, OrderByStr, OrderByStrDescending, OrderByNum, OrderByNumDescending, Reverse, ThenBy, ThenByDescending, ThenByStr, ThenByStrDescending, ThenByNum, ThenByNumDescending, GroupBy, Distinct, Union, Intersect, Except, Join, GroupJoin, All, Any, Contains, SequenceEqual, First, FirstOrDefault, Last, LastOrDefault, Single, SingleOrDefault, ElementAt, ElementAtOrDefault, Count, Sum, Min, Max, Average, AverageOrDefault, Aggregate, ToArray, ToList, ToDictionary, ToLookup, DefaultIfEmpty, ForEach.

EXAMPLES

Basic JSON File Query

use JSON::LINQ;

# users.json: [{"name":"Alice","age":30}, {"name":"Bob","age":25}, ...]
my @adults = JSON::LINQ->FromJSON("users.json")
    ->Where(sub { $_[0]{age} >= 18 })
    ->OrderBy(sub { $_[0]{name} })
    ->ToArray();

JSONL Streaming

# events.jsonl: one JSON object per line
my $error_count = JSON::LINQ->FromJSONL("events.jsonl")
    ->Count(sub { $_[0]{level} eq 'ERROR' });

JSON::LINQ->FromJSONL("events.jsonl")
    ->Where(sub { $_[0]{level} eq 'ERROR' })
    ->ForEach(sub { print $_[0]{message}, "\n" });

Aggregation

my $avg = JSON::LINQ->FromJSON("orders.json")
    ->Where(sub { $_[0]{status} eq 'completed' })
    ->Average(sub { $_[0]{amount} });

printf "Average order: %.2f\n", $avg;

Grouping

my @by_category = JSON::LINQ->FromJSON("products.json")
    ->GroupBy(sub { $_[0]{category} })
    ->Select(sub {
        my $g = shift;
        {
            Category => $g->{Key},
            Count    => scalar(@{$g->{Elements}}),
            MaxPrice => JSON::LINQ->From($g->{Elements})
                            ->Max(sub { $_[0]{price} }),
        }
    })
    ->OrderByDescending(sub { $_[0]{Count} })
    ->ToArray();

Transform and Write

# Read JSON, transform, write back as JSONL
JSON::LINQ->FromJSON("input.json")
    ->Select(sub {
        my $r = shift;
        return { %$r, processed => JSON::LINQ::true };
    })
    ->ToJSONL("output.jsonl");

JOIN: JSON (main) with LTSV (sub-table)

A common pattern: the primary records live in a JSON file, and a small lookup table is maintained in LTSV format. The example below reads employees from a JSON file and joins them against a department lookup table in LTSV format.

# employees.json
# [
#   {"id":1,"name":"Alice","dept_id":10},
#   {"id":2,"name":"Bob",  "dept_id":20},
#   {"id":3,"name":"Carol","dept_id":10}
# ]
#
# departments.ltsv
# id:10<TAB>name:Engineering
# id:20<TAB>name:Sales

my $depts = JSON::LINQ->FromLTSV("departments.ltsv");

my @joined = JSON::LINQ->FromJSON("employees.json")
    ->Join($depts,
        sub { $_[0]{dept_id} },     # outer key (JSON side)
        sub { $_[0]{id}      },     # inner key (LTSV side)
        sub { { name => $_[0]{name},
                dept => $_[1]{name} } })
    ->OrderBy(sub { $_[0]{name} })
    ->ToArray();

# @joined == ({name=>"Alice", dept=>"Engineering"},
#             {name=>"Bob",   dept=>"Sales"},
#             {name=>"Carol", dept=>"Engineering"})

JOIN: LTSV (main) with JSON (sub-table)

The opposite pattern: the primary records are in an LTSV log file (often high-volume, append-only), and the lookup table is in JSON.

# orders.ltsv
# id:1001<TAB>sku:A100<TAB>qty:2
# id:1002<TAB>sku:B200<TAB>qty:1
# id:1003<TAB>sku:A100<TAB>qty:5
#
# prices.json
# [
#   {"sku":"A100","price":300},
#   {"sku":"B200","price":1200}
# ]

my $prices = JSON::LINQ->FromJSON("prices.json");

my @priced = JSON::LINQ->FromLTSV("orders.ltsv")
    ->Join($prices,
        sub { $_[0]{sku} },                       # outer key (LTSV)
        sub { $_[0]{sku} },                       # inner key (JSON)
        sub { { order_id => $_[0]{id},
                amount   => $_[0]{qty} * $_[1]{price} } })
    ->ToArray();

# @priced == ({order_id=>1001, amount=>600},
#             {order_id=>1002, amount=>1200},
#             {order_id=>1003, amount=>1500})

Join builds a hash from the inner (sub-table) sequence, so it is efficient even when the outer sequence is large and read lazily.
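
The hash-join idea can be sketched in plain Perl, reusing the sample data from the example above. This illustrates the technique rather than the module's implementation; the unmatched order 1004 is added here only to show that an inner join drops it.

```perl
use strict;
use warnings;

# Inner (lookup) side: small, indexed up front by its join key.
my @prices = (
    { sku => 'A100', price => 300  },
    { sku => 'B200', price => 1200 },
);
my %by_sku;
push @{ $by_sku{ $_->{sku} } }, $_ for @prices;

# Outer side: visited one record at a time; each lookup is a hash probe.
my @orders = (
    { id => 1001, sku => 'A100', qty => 2 },
    { id => 1002, sku => 'B200', qty => 1 },
    { id => 1003, sku => 'A100', qty => 5 },
    { id => 1004, sku => 'Z999', qty => 1 },   # no match: dropped (inner join)
);
my @priced;
for my $order (@orders) {
    my $matches = $by_sku{ $order->{sku} } or next;
    for my $price (@$matches) {
        push @priced, { order_id => $order->{id},
                        amount   => $order->{qty} * $price->{price} };
    }
}
printf "%d %d\n", $_->{order_id}, $_->{amount} for @priced;
```

The cost is one pass over each side, so only the inner sequence needs to fit in memory.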

Basic CSV Query

use JSON::LINQ;

# sales.csv:
#   name,amount,category
#   Alice,1500,A
#   Bob,800,B
#   Carol,2000,A

my @high_sales = JSON::LINQ->FromCSV("sales.csv")
    ->Where(sub { $_[0]{amount} > 1000 })
    ->OrderByNumDescending(sub { $_[0]{amount} })
    ->ToArray();

DSL Filtering on CSV

my @tokyo = JSON::LINQ->FromCSV("users.csv")
    ->Where(city => 'Tokyo')
    ->ToArray();

Grouping and Aggregation on CSV

my @by_category = JSON::LINQ->FromCSV("sales.csv")
    ->GroupBy(sub { $_[0]{category} })
    ->Select(sub {
        my $g = shift;
        {
            Category => $g->{Key},
            Count    => scalar(@{$g->{Elements}}),
            Total    => JSON::LINQ->From($g->{Elements})
                            ->Sum(sub { $_[0]{amount} }),
        }
    })
    ->OrderByNumDescending(sub { $_[0]{Total} })
    ->ToArray();

JOIN Two CSV Files

# orders.csv: id,customer_id,amount
# customers.csv: id,name,city

my $orders    = JSON::LINQ->FromCSV("orders.csv");
my $customers = JSON::LINQ->FromCSV("customers.csv");

my @joined = $orders->Join(
    $customers,
    sub { $_[0]{customer_id} },
    sub { $_[0]{id} },
    sub { { Name => $_[1]{name}, Amount => $_[0]{amount} } }
)->ToArray();

TSV Support

my @data = JSON::LINQ->FromCSV("data.tsv", sep => "\t")
    ->Where(status => 'active')
    ->ToArray();

CSV Round-Trip (Filter and Write)

JSON::LINQ->FromCSV("input.csv")
    ->Where(sub { $_[0]{active} eq '1' })
    ->ToCSV("active.csv");

CSV to JSON Conversion

JSON::LINQ->FromCSV("data.csv")
    ->Select(sub {
        my $r = shift;
        return { %$r, processed => JSON::LINQ::true };
    })
    ->ToJSON("data.json");

In-Memory Array Query

my @data = (
    {name => 'Alice', score => 95},
    {name => 'Bob',   score => 72},
    {name => 'Carol', score => 88},
);

my @top = JSON::LINQ->From(\@data)
    ->Where(sub { $_[0]{score} >= 80 })
    ->OrderByDescending(sub { $_[0]{score} })
    ->ToArray();

FEATURES

Lazy Evaluation

FromJSONL reads one line at a time. Combined with Where and Take, only the needed records are ever in memory simultaneously.

FromJSON reads the whole file once but then iterates the array lazily.

Built-in JSON Parser

JSON::LINQ contains its own JSON encoder/decoder (derived from mb::JSON 0.06). No CPAN JSON module is required. The parser handles:

  • UTF-8 multibyte strings (output as-is, not \uXXXX-escaped)

  • \uXXXX escape sequences on input (converted to UTF-8)

  • All JSON types: object, array, string, number, true, false, null

  • Nested structures of arbitrary depth

ARCHITECTURE

Relationship to LTSV::LINQ

JSON::LINQ and LTSV::LINQ are parallel modules sharing the same LINQ API.

LTSV::LINQ  - LINQ for LTSV (Labeled Tab-Separated Values) files
JSON::LINQ  - LINQ for JSON and JSONL files

JSON::LINQ adds the following I/O methods on top of LTSV::LINQ's interface:

FromJSON($file)         - read JSON array file
FromJSONL($file)        - read JSONL file (streaming)
FromJSONString($json)   - read JSON string
FromLTSV($file)         - read LTSV file (streaming)
FromCSV($file)          - read CSV file (streaming, RFC 4180)
ToJSON($file)           - write JSON array file
ToJSONL($file)          - write JSONL file
ToLTSV($file)           - write LTSV file (streaming)
ToCSV($file)            - write CSV file

FromLTSV, ToLTSV, FromCSV, and ToCSV are provided so a JSON::LINQ pipeline can JOIN against (or emit into) LTSV and CSV files without requiring LTSV::LINQ or CSV::LINQ to be installed.

The internal iterator architecture is identical: each operator returns a new query object wrapping a closure.
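
A closure-based operator chain of this kind might look like the sketch below. The function names (range_iter, where_iter, take_iter) are hypothetical and use plain code refs rather than query objects; the point is only that each operator wraps the previous iterator and pulls elements on demand.

```perl
use strict;
use warnings;

# Each operator takes an iterator (a code ref returning the next element,
# or undef at the end) and returns a new iterator wrapping it.
sub range_iter {
    my ($start, $count) = @_;
    my $i = 0;
    return sub { $i < $count ? $start + $i++ : undef };
}
sub where_iter {
    my ($source, $pred) = @_;
    return sub {
        while (defined(my $item = $source->())) {
            return $item if $pred->($item);
        }
        return undef;
    };
}
sub take_iter {
    my ($source, $n) = @_;
    return sub { $n-- > 0 ? $source->() : undef };
}

# Count how many elements are actually pulled from the source.
my $pulled  = 0;
my $source  = range_iter(1, 1_000_000);
my $counted = sub { my $v = $source->(); $pulled++ if defined $v; $v };
my $query   = take_iter(where_iter($counted, sub { $_[0] % 7 == 0 }), 3);

my @out;
while (defined(my $v = $query->())) { push @out, $v }
print "@out (pulled $pulled source elements)\n";
# prints "7 14 21 (pulled 21 source elements)"
```

Although the source could yield a million values, only 21 are ever generated, because nothing runs until the terminal loop asks for the next element.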

Memory Characteristics

FromJSONL  - O(1) per record: one line at a time
FromJSON   - O(n): entire file loaded once, then lazy iteration
FromLTSV   - O(1) per record: one line at a time
FromCSV    - O(1) per record: one line at a time
ToJSON     - O(n): entire sequence collected for array output
ToJSONL    - O(1) per record: streaming write
ToLTSV     - O(1) per record: streaming write
ToCSV      - O(n): entire sequence collected before writing header

COMPATIBILITY

Perl Version Support

Compatible with Perl 5.005_03 and later. See LTSV::LINQ for the full compatibility rationale (Universal Consensus 1998 / Perl 5.005_03).

Pure Perl Implementation

No XS dependencies. No CPAN module dependencies. Works on any Perl installation with only the standard core.

JSON Limitations

The built-in parser has the same limitations as mb::JSON 0.06:

  • Surrogate pairs (\uD800-\uDFFF) are not supported

  • Circular references in encoding cause infinite recursion

  • Non-ARRAY/HASH references are stringified

Iterator Protocol and JSON null

The internal iterator protocol uses undef to signal end-of-sequence. As a consequence, an undef value (i.e. a decoded JSON null) cannot appear as a top-level element of a sequence: it would be indistinguishable from EOF and the sequence would be silently truncated at that point.

This affects Select in particular: a selector that returns undef for some elements will terminate the sequence early.

# JSON: [{"v":1},{"v":null},{"v":3}]
JSON::LINQ->FromJSON("data.json")
          ->Select(sub { $_[0]{v} })
          ->ToArray;
# returns (1) - sequence stops at the undef from the second record

Where is unaffected when filtering hash records (the hashref itself is the element, not its v field), but a Select that projects a nullable field will be truncated at the first null. Workarounds:

  • Project to a sentinel value: Select(sub { defined $_[0]{v} ? $_[0]{v} : '' })

  • Wrap each element in a hashref so the element itself is never undef.

DefaultIfEmpty(undef) is similarly affected: a default of undef is silently lost. Use a non-undef sentinel (0, '', {}) instead.
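
The truncation follows directly from any iterator protocol that uses undef as its end marker. The sketch below uses plain closures with hypothetical names (from_array, select_iter, to_array), not the module's classes, to show both the problem and the sentinel workaround.

```perl
use strict;
use warnings;

# A source iterator over an array; undef marks end-of-sequence.
sub from_array {
    my @items = @{ $_[0] };
    my $i = 0;
    return sub { $i < @items ? $items[$i++] : undef };
}
# A Select-like operator: applies $f to each element of $source.
sub select_iter {
    my ($source, $f) = @_;
    return sub {
        my $item = $source->();
        return defined $item ? $f->($item) : undef;
    };
}
# A terminal operation: drains the iterator until it sees undef.
sub to_array {
    my ($iter) = @_;
    my @out;
    while (defined(my $v = $iter->())) { push @out, $v }
    return @out;
}

my @data = ({ v => 1 }, { v => undef }, { v => 3 });

# Projecting the nullable field: the undef looks like end-of-sequence.
my @truncated = to_array(select_iter(from_array(\@data), sub { $_[0]{v} }));

# Workaround: map undef to a sentinel so every element stays defined.
my @safe = to_array(select_iter(from_array(\@data),
                    sub { defined $_[0]{v} ? $_[0]{v} : '' }));

print scalar(@truncated), " vs ", scalar(@safe), "\n";   # prints "1 vs 3"
```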

DIAGNOSTICS

JSON::LINQ::FromJSON: cannot parse '$file': $@

The file exists but does not contain valid JSON.

JSON::LINQ::FromJSON: '$file' must contain a JSON array or object

The file contains valid JSON but the top-level value is a string, number, or boolean, not an array or object.

JSON::LINQ::FromJSONL: skipping invalid JSON line: $@

A line in a JSONL file could not be parsed. The line is skipped with a warning; processing continues.

JSON::LINQ::FromJSONString: cannot parse JSON: $@

The supplied JSON string is not valid JSON.

JSON::LINQ::_json_decode: ...

Internal JSON parsing error. The message includes the specific unexpected token or an indication of where parsing stopped.

JSON::LINQ::_json_decode: expected ',' or ']' in array

The JSON array was not properly terminated or separated.

JSON::LINQ::_json_decode: expected ',' or '}' in object

A JSON object was not properly terminated or separated.

JSON::LINQ::_json_decode: expected ':' after key '$key'

The colon separator was missing after a JSON object key.

JSON::LINQ::_json_decode: expected string key in object

A JSON object key was not a quoted string.

JSON::LINQ::_json_decode: trailing garbage:

Extra text was found after a successfully parsed top-level JSON value. The message is followed by the first 20 characters of the unexpected text.

JSON::LINQ::_json_decode: unexpected end of input

The JSON text ended before a complete value was parsed.

JSON::LINQ::_json_decode: unexpected token:

An unrecognised token was encountered while parsing JSON. The message is followed by the first 20 characters of the unexpected text.

JSON::LINQ::_json_decode: unterminated string

A JSON string was not closed with a double-quote.

Cannot open '$file': $!

Thrown by FromJSON, FromJSONL, FromLTSV, or FromCSV when the input file cannot be opened.

Cannot open '$filename': $!

Thrown by ToJSON, ToJSONL, ToLTSV, or ToCSV when the output file cannot be opened.

From() requires ARRAY reference

Thrown by From() when the argument is not an array reference.

Index must be non-negative

Thrown by ElementAt() when the supplied index is less than zero.

Index out of range

Thrown by ElementAt() when the index is beyond the end of the sequence. Use ElementAtOrDefault() to avoid this error.

Invalid number of arguments for Aggregate

Thrown by Aggregate() when called with an argument count other than 1, 2, or 3.

Sequence contains no elements

Thrown by First(), Last(), Average(), Aggregate() (no-seed form), and Single() when the sequence is empty or no element satisfies the predicate.

Sequence contains more than one element

Thrown by Single() when more than one element (or matching element) is found.

No element satisfies the condition

Thrown by First() or Last() with a predicate when no element matches.

SelectMany: selector must return an ARRAY reference

Thrown by SelectMany() when the selector function returns a non-array value.

All other error messages are identical to LTSV::LINQ.

LIMITATIONS AND KNOWN ISSUES

  • Iterator Consumption

    Query objects can only be consumed once. The iterator is exhausted after terminal operations (ToArray, Count, Sum, ToCSV, etc.). Create a new query or save the ToArray() result to reuse data.

  • Undef Values

    Due to the iterator-based design, undef signals end-of-sequence. A Select selector that returns undef will terminate the sequence early. See "Iterator Protocol and JSON null" for details and workarounds.

  • Multi-line CSV Fields

    FromCSV reads the file one line at a time. RFC 4180 quoted fields that contain embedded newlines (multi-line fields) are not yet supported. Single-line quoted fields containing commas and escaped double-quotes ("") are handled correctly.

  • No Parallel Execution

    All operations execute sequentially in a single thread.

BUGS

Please report bugs to ina@cpan.org.

SEE ALSO

AUTHOR

INABA Hitoshi <ina@cpan.org>

COPYRIGHT AND LICENSE

Copyright (c) 2026 INABA Hitoshi

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.