NAME

LTSV::LINQ - LINQ-style query interface for LTSV files

VERSION

Version 1.00

SYNOPSIS

use LTSV::LINQ;

# Read LTSV file and query
my @results = LTSV::LINQ->FromLTSV("access.log")
    ->Where(sub { $_[0]{status} eq '200' })
    ->Select(sub { $_[0]{url} })
    ->Distinct()
    ->ToArray();

# DSL syntax for simple filtering
my @errors = LTSV::LINQ->FromLTSV("access.log")
    ->Where(status => '404')
    ->ToArray();

# Grouping and aggregation
my @stats = LTSV::LINQ->FromLTSV("access.log")
    ->GroupBy(sub { $_[0]{status} })
    ->Select(sub {
        my $g = shift;
        return {
            Status => $g->{Key},
            Count => scalar(@{$g->{Elements}})
        };
    })
    ->OrderByDescending(sub { $_[0]{Count} })
    ->ToArray();

DESCRIPTION

LTSV::LINQ provides a LINQ-style query interface for LTSV (Labeled Tab-Separated Values) files. It offers a fluent, chainable API for filtering, transforming, and aggregating LTSV data.

Key features:

  • Lazy evaluation - O(1) memory usage for most operations

  • Method chaining - Fluent, readable query composition

  • DSL syntax - Simple key-value filtering

  • 30+ LINQ methods - Comprehensive query capabilities

  • Pure Perl - No XS dependencies

  • Perl 5.005_03+ - Works on ancient and modern Perl

What is LTSV?

LTSV (Labeled Tab-Separated Values) is a format for structured logs. Each line contains tab-separated key:value pairs.

Example:

time:2026-02-13T10:00:00	status:200	url:/index.html	bytes:1024

For more information: http://ltsv.org/
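
As a rough illustration of the format, the sketch below parses one LTSV line into a hash reference in plain Perl. The helper name parse_ltsv_line is hypothetical and is not part of LTSV::LINQ; the module's own parser may differ in details such as how it handles fields without a colon.

```perl
use strict;

# Hypothetical helper, not part of LTSV::LINQ:
# parse one LTSV line into a hash reference.
sub parse_ltsv_line {
    my ($line) = @_;
    chomp $line;
    my %record;
    for my $field (split /\t/, $line) {
        # Split on the first colon only; values may contain colons.
        my ($label, $value) = split /:/, $field, 2;
        next unless defined $label && length $label;
        $record{$label} = defined $value ? $value : '';
    }
    return \%record;
}

my $rec = parse_ltsv_line(
    "time:2026-02-13T10:00:00\tstatus:200\turl:/index.html\tbytes:1024\n");
print "$rec->{status} $rec->{url}\n";   # prints "200 /index.html"
```

Splitting with a limit of 2 keeps the colons inside the time value intact.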

What is LINQ?

LINQ (Language Integrated Query) is a query syntax in C# and .NET. This module brings LINQ-style querying to Perl for LTSV data.

For more information: https://learn.microsoft.com/en-us/dotnet/csharp/linq/

METHODS

Complete Method Reference

This module implements 30 LINQ-style methods organized into 12 categories:

  • Data Sources (3): From, FromLTSV, Range

  • Filtering (1): Where (with DSL)

  • Projection (2): Select, SelectMany

  • Partitioning (3): Take, Skip, TakeWhile

  • Ordering (3): OrderBy, OrderByDescending, Reverse

  • Grouping (1): GroupBy

  • Set Operations (1): Distinct

  • Quantifiers (2): All, Any

  • Element Access (4): First, FirstOrDefault, Last, LastOrDefault

  • Aggregation (6): Count, Sum, Min, Max, Average, AverageOrDefault

  • Conversion (3): ToArray, ToList, ToLTSV

  • Utility (1): ForEach

Method Summary Table:

Method              Category        Lazy?  Returns
==================  ==============  =====  ================
From                Data Source     Yes    Query
FromLTSV            Data Source     Yes    Query
Range               Data Source     Yes    Query
Where               Filtering       Yes    Query
Select              Projection      Yes    Query
SelectMany          Projection      Yes    Query
Take                Partitioning    Yes    Query
Skip                Partitioning    Yes    Query
TakeWhile           Partitioning    Yes    Query
OrderBy             Ordering        No*    Query
OrderByDescending   Ordering        No*    Query
Reverse             Ordering        No*    Query
GroupBy             Grouping        No*    Query
Distinct            Set Operation   Yes    Query
All                 Quantifier      No     Boolean
Any                 Quantifier      No     Boolean
First               Element Access  No     Element
FirstOrDefault      Element Access  No     Element
Last                Element Access  No*    Element
LastOrDefault       Element Access  No*    Element or undef
Count               Aggregation     No     Integer
Sum                 Aggregation     No     Number
Min                 Aggregation     No     Number
Max                 Aggregation     No     Number
Average             Aggregation     No     Number
AverageOrDefault    Aggregation     No     Number or undef
ToArray             Conversion      No     Array
ToList              Conversion      No     ArrayRef
ToLTSV              Conversion      No     Boolean
ForEach             Utility         No     Void

* Materializing operation (loads all data into memory)

Data Source Methods

From(\@array)

Create a query from an array.

my $query = LTSV::LINQ->From([{name => 'Alice'}, {name => 'Bob'}]);

FromLTSV($filename)

Create a query from an LTSV file.

my $query = LTSV::LINQ->FromLTSV("access.log");

Range($start, $count)

Generate a sequence of integers.

my $query = LTSV::LINQ->Range(1, 10);  # 1, 2, ..., 10
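
To illustrate the ($start, $count) signature, the sketch below builds a lazy integer sequence in plain Perl. The helper name iter_range is hypothetical and not part of LTSV::LINQ. It assumes, as the signature suggests, that the second argument is a count rather than an end value.

```perl
use strict;

# Hypothetical helper, not part of LTSV::LINQ: a lazy integer range.
# Returns an iterator that yields $count integers starting at $start,
# then undef.
sub iter_range {
    my ($start, $count) = @_;
    my $i = 0;
    return sub { return $i < $count ? $start + $i++ : undef };
}

my $next = iter_range(5, 3);
my @values;
while (defined(my $n = $next->())) {
    push @values, $n;
}
print "@values\n";   # prints "5 6 7"
```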

Filtering Methods

Where($predicate)
Where(key => value, ...)

Filter elements. Accepts either a code reference or DSL form.

Code Reference Form:

->Where(sub { $_[0]{status} == 200 })
->Where(sub { $_[0]{status} >= 400 && $_[0]{bytes} > 1000 })

The code reference receives each element as $_[0] and should return true to include the element, false to exclude it.

DSL Form:

The DSL (Domain Specific Language) form provides a concise syntax for simple equality comparisons. All conditions are combined with AND logic.

# Single condition
->Where(status => '200')

# Multiple conditions (AND)
->Where(status => '200', method => 'GET')

# Equivalent to:
->Where(sub { 
    $_[0]{status} eq '200' && $_[0]{method} eq 'GET' 
})

DSL Specification:

  • All comparisons are string equality (eq)

  • All conditions are combined with AND

  • A record whose field is undefined or missing never matches

  • For numeric or OR logic, use code reference form
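
The rules above can be sketched as a predicate builder in plain Perl. The helper name make_dsl_predicate is hypothetical; this is one plausible reading of the specification, not the module's actual implementation.

```perl
use strict;

# Hypothetical helper, not part of LTSV::LINQ:
# turn key => value pairs into a predicate following the DSL rules.
sub make_dsl_predicate {
    my (%cond) = @_;
    return sub {
        my ($rec) = @_;
        for my $key (keys %cond) {
            return 0 unless defined $rec->{$key};        # undefined field fails
            return 0 unless $rec->{$key} eq $cond{$key}; # string equality
        }
        return 1;   # every condition matched (AND)
    };
}

my $pred = make_dsl_predicate(status => '200', method => 'GET');
my @records = (
    { status => '200', method => 'GET'  },
    { status => '200', method => 'POST' },
    { status => '200' },                    # no method field
);
my @hits = grep { $pred->($_) } @records;
print scalar(@hits), "\n";   # prints "1"
```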

Examples:

# DSL: Simple and readable
->Where(status => '200')
->Where(user => 'alice', role => 'admin')

# Code ref: Complex logic
->Where(sub { $_[0]{status} >= 400 && $_[0]{status} < 500 })
->Where(sub { $_[0]{user} eq 'alice' || $_[0]{user} eq 'bob' })

Projection Methods

Select($selector)

Transform each element using the provided selector function.

The selector receives each element as $_[0] and should return the transformed value.

Parameters:

  • $selector - Code reference that transforms each element

Returns: New query with transformed elements (lazy)

Examples:

# Extract single field
->Select(sub { $_[0]{url} })

# Transform to new structure
->Select(sub { 
    { 
        path => $_[0]{url}, 
        code => $_[0]{status} 
    } 
})

# Calculate derived values
->Select(sub { $_[0]{bytes} * 8 })  # bytes to bits

Note: Select preserves one-to-one mapping. For one-to-many, use SelectMany.

SelectMany($selector)

Flatten nested sequences into a single sequence.

The selector should return an array reference. All arrays are flattened into a single sequence.

Parameters:

  • $selector - Code reference returning array reference

Returns: New query with flattened elements (lazy)

Examples:

# Flatten array of arrays
my @nested = ([1, 2], [3, 4], [5]);
LTSV::LINQ->From(\@nested)
    ->SelectMany(sub { $_[0] })
    ->ToArray();  # (1, 2, 3, 4, 5)

# Expand related records
->SelectMany(sub {
    my $user = shift;
    return [ map { 
        { user => $user->{name}, role => $_ } 
    } @{$user->{roles}} ];
})

Use Cases:

  • Flattening nested arrays

  • Expanding one-to-many relationships

  • Generating multiple outputs per input

Partitioning Methods

Take($count)

Take the first N elements from the sequence.

Parameters:

  • $count - Number of elements to take (integer >= 0)

Returns: New query limited to first N elements (lazy)

Examples:

# Top 10 results
->OrderByDescending(sub { $_[0]{score} })
  ->Take(10)

# First record only
->Take(1)->ToArray()

# Limit large file processing
LTSV::LINQ->FromLTSV("huge.log")->Take(1000)

Note: Take(0) returns an empty sequence. Negative values are treated as 0.

Skip($count)

Skip the first N elements, return the rest.

Parameters:

  • $count - Number of elements to skip (integer >= 0)

Returns: New query skipping first N elements (lazy)

Examples:

# Skip header row
->Skip(1)

# Pagination: page 3, size 20
->Skip(40)->Take(20)

# Skip first batch
->Skip(1000)->ForEach(sub { ... })

Use Cases:

  • Pagination

  • Skipping header rows

  • Processing in batches
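
For pagination, the skip count is (page - 1) * size. The plain-Perl sketch below works through that arithmetic on an in-memory array. The helper name page_of is hypothetical and pages are assumed to be 1-based.

```perl
use strict;

# Hypothetical helper, not part of LTSV::LINQ:
# return the items on a 1-based page of an array.
sub page_of {
    my ($items, $page, $size) = @_;
    my $skip = ($page - 1) * $size;          # page 3, size 20 -> skip 40
    return () if $skip >= @$items;
    my $last = $skip + $size - 1;
    $last = $#$items if $last > $#$items;    # final page may be short
    return @{$items}[$skip .. $last];
}

my @items = (1 .. 95);
my @page3 = page_of(\@items, 3, 20);
print "$page3[0]..$page3[-1]\n";   # prints "41..60"
my @page5 = page_of(\@items, 5, 20);
print scalar(@page5), "\n";        # prints "15"
```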

TakeWhile($predicate)

Take elements while the predicate is true. Stops at first false.

Parameters:

  • $predicate - Code reference returning boolean

Returns: New query taking elements while predicate holds (lazy)

Examples:

# Take while value is small
->TakeWhile(sub { $_[0]{count} < 100 })

# Take while timestamp is in range
->TakeWhile(sub { $_[0]{time} lt '2026-02-01' })

# Process until error
->TakeWhile(sub { $_[0]{status} < 400 })

Important: TakeWhile stops immediately when predicate returns false. It does NOT filter - it terminates the sequence.

# Different from Where:
->TakeWhile(sub { $_[0] < 5 })  # 1,2,3,4 then STOP
->Where(sub { $_[0] < 5 })      # 1,2,3,4 (checks all)
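
The difference only shows when a failing element appears before later passing ones. The plain-Perl sketch below mimics both behaviours on an unsorted list. It does not use LTSV::LINQ.

```perl
use strict;

my @input = (1, 2, 7, 3, 4);

# TakeWhile semantics: stop at the first element that fails the test.
my @taken;
for my $x (@input) {
    last unless $x < 5;
    push @taken, $x;
}

# Where semantics: test every element, keep those that pass.
my @filtered = grep { $_ < 5 } @input;

print "TakeWhile: @taken\n";     # prints "TakeWhile: 1 2"
print "Where:     @filtered\n";  # prints "Where:     1 2 3 4"
```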

Ordering Methods

OrderBy($key_selector)

Sort in ascending order.

->OrderBy(sub { $_[0]{timestamp} })

OrderByDescending($key_selector)

Sort in descending order.

->OrderByDescending(sub { $_[0]{count} })

Reverse()

Reverse the order.

->Reverse()

Grouping Methods

GroupBy($key_selector [, $element_selector])

Group elements by key.

->GroupBy(sub { $_[0]{status} })

Returns a query whose elements are hash references with 'Key' and 'Elements' fields. 'Elements' holds an array reference of the grouped items.
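
The shape of a group record can be sketched in plain Perl. The helper name group_by is hypothetical and is not the module's implementation. It returns groups in sorted key order, which is an assumption about how the real GroupBy orders its output.

```perl
use strict;

# Hypothetical helper, not part of LTSV::LINQ:
# group records by the value returned from a key selector.
sub group_by {
    my ($records, $key_selector) = @_;
    my %groups;
    for my $rec (@$records) {
        push @{ $groups{ $key_selector->($rec) } }, $rec;
    }
    my @result;
    for my $key (sort keys %groups) {
        push @result, { Key => $key, Elements => $groups{$key} };
    }
    return @result;
}

my @records = (
    { status => '200', url => '/a' },
    { status => '404', url => '/x' },
    { status => '200', url => '/b' },
);

my @groups = group_by(\@records, sub { $_[0]{status} });
for my $g (@groups) {
    printf "%s: %d\n", $g->{Key}, scalar @{ $g->{Elements} };
}
# prints "200: 2" then "404: 1"
```

Every record is held in memory until grouping finishes, which is why grouping counts as a materializing operation.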

Set Operations

Distinct([$comparer])

Remove duplicate elements.

->Distinct()
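
Lazy deduplication can be sketched with a hash of values already seen. The helpers iter_from_array and iter_distinct are hypothetical and use stringified values as keys. The module's own Distinct, especially with a comparer, may behave differently.

```perl
use strict;

# Hypothetical helpers, not part of LTSV::LINQ.
# Iterator over an array: one element per call, undef at the end.
sub iter_from_array {
    my ($array) = @_;
    my $i = 0;
    return sub { return $i < @$array ? $array->[$i++] : undef };
}

# Wrap an iterator so that each distinct value is returned only once.
sub iter_distinct {
    my ($iter) = @_;
    my %seen;
    return sub {
        while (defined(my $item = $iter->())) {
            return $item unless $seen{$item}++;
        }
        return undef;
    };
}

my $next = iter_distinct(iter_from_array(['/a', '/b', '/a', '/c', '/b']));
my @unique;
while (defined(my $url = $next->())) {
    push @unique, $url;
}
print "@unique\n";   # prints "/a /b /c"
```

The %seen hash grows with the number of unique values, so memory use is proportional to unique elements rather than constant.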

Quantifier Methods

All($predicate)

Test if all elements satisfy condition.

->All(sub { $_[0]{status} == 200 })

Any([$predicate])

Test if any element satisfies condition.

->Any(sub { $_[0]{status} >= 400 })
->Any()  # Test if sequence is non-empty

Element Access Methods

First([$predicate])

Get first element. Dies if empty.

->First()
->First(sub { $_[0]{status} == 404 })

FirstOrDefault([$predicate,] $default)

Get first element or default value.

->FirstOrDefault(undef, {})

Last([$predicate])

Get last element. Dies if empty.

->Last()

Aggregation Methods

All aggregation methods are terminal operations - they consume the entire sequence and return a scalar value.

Count([$predicate])

Count the number of elements.

Parameters:

  • $predicate - (Optional) Code reference to filter elements

Returns: Integer count

Examples:

# Count all
->Count()  # 1000

# Count with condition
->Count(sub { $_[0]{status} >= 400 })  # 42

# Equivalent to
->Where(sub { $_[0]{status} >= 400 })->Count()

Performance: O(n) - must iterate entire sequence

Sum([$selector])

Calculate sum of numeric values.

Parameters:

  • $selector - (Optional) Code reference to extract value. Default: identity function

Returns: Numeric sum

Examples:

# Sum of values
LTSV::LINQ->From([1, 2, 3, 4, 5])->Sum()  # 15

# Sum of field
->Sum(sub { $_[0]{bytes} })

# Sum with transformation
->Sum(sub { $_[0]{price} * $_[0]{quantity} })

Note: Non-numeric values may produce warnings. Make sure the selector returns numbers.

Min([$selector])

Find minimum value.

Parameters:

  • $selector - (Optional) Code reference to extract value

Returns: Minimum value (numeric comparison)

Examples:

# Minimum of values
->Min()

# Minimum of field
->Min(sub { $_[0]{response_time} })

# Oldest timestamp (comparison is numeric, so use epoch seconds)
->Min(sub { $_[0]{timestamp} })

Returns: undef if sequence is empty

Max([$selector])

Find maximum value.

Parameters:

  • $selector - (Optional) Code reference to extract value

Returns: Maximum value (numeric comparison)

Examples:

# Maximum of values
->Max()

# Maximum of field
->Max(sub { $_[0]{bytes} })

# Latest timestamp (comparison is numeric, so use epoch seconds)
->Max(sub { $_[0]{timestamp} })

Returns: undef if sequence is empty

Average([$selector])

Calculate arithmetic mean.

Parameters:

  • $selector - (Optional) Code reference to extract value

Returns: Numeric average (floating point)

Examples:

# Average of values
LTSV::LINQ->From([1, 2, 3, 4, 5])->Average()  # 3

# Average of field
->Average(sub { $_[0]{bytes} })

# Average response time
->Average(sub { $_[0]{response_time} })

Throws: Dies with "Sequence contains no elements" if empty

Note: Returns floating point. Use int() for integer result.

AverageOrDefault([$selector])

Calculate arithmetic mean, or return undef if sequence is empty.

Parameters:

  • $selector - (Optional) Code reference to extract value

Returns: Numeric average (floating point), or undef if empty

Examples:

# Safe average - returns undef for empty sequence
my @empty = ();
my $avg = LTSV::LINQ->From(\@empty)->AverageOrDefault();  # undef

# With data
LTSV::LINQ->From([1, 2, 3])->AverageOrDefault();  # 2

# With selector
->AverageOrDefault(sub { $_[0]{value} })

Note: Unlike Average(), this method never throws an exception.

Conversion Methods

ToArray()

Convert to array.

my @array = $query->ToArray();

ToList()

Convert to array reference.

my $arrayref = $query->ToList();

ToLTSV($filename)

Write to LTSV file.

$query->ToLTSV("output.ltsv");

Utility Methods

ForEach($action)

Execute action for each element.

$query->ForEach(sub { print $_[0]{url}, "\n" });

EXAMPLES

Basic Filtering

use LTSV::LINQ;

# DSL syntax
my @successful = LTSV::LINQ->FromLTSV("access.log")
    ->Where(status => '200')
    ->ToArray();

# Code reference
my @errors = LTSV::LINQ->FromLTSV("access.log")
    ->Where(sub { $_[0]{status} >= 400 })
    ->ToArray();

Aggregation

# Count errors
my $error_count = LTSV::LINQ->FromLTSV("access.log")
    ->Where(sub { $_[0]{status} >= 400 })
    ->Count();

# Average bytes for successful requests
my $avg_bytes = LTSV::LINQ->FromLTSV("access.log")
    ->Where(status => '200')
    ->Average(sub { $_[0]{bytes} });

print "Average bytes: $avg_bytes\n";

Grouping and Ordering

# Top 10 URLs by request count
my @top_urls = LTSV::LINQ->FromLTSV("access.log")
    ->Where(sub { $_[0]{status} eq '200' })
    ->GroupBy(sub { $_[0]{url} })
    ->Select(sub {
        my $g = shift;
        return {
            URL => $g->{Key},
            Count => scalar(@{$g->{Elements}}),
            TotalBytes => LTSV::LINQ->From($g->{Elements})
                ->Sum(sub { $_[0]{bytes} })
        };
    })
    ->OrderByDescending(sub { $_[0]{Count} })
    ->Take(10)
    ->ToArray();

for my $stat (@top_urls) {
    printf "%5d requests - %s (%d bytes)\n",
        $stat->{Count}, $stat->{URL}, $stat->{TotalBytes};
}

Complex Query Chain

# Multi-step analysis
my @result = LTSV::LINQ->FromLTSV("access.log")
    ->Where(status => '200')              # Filter successful
    ->Select(sub { $_[0]{bytes} })         # Extract bytes
    ->Where(sub { $_[0] > 1000 })          # Large responses only
    ->OrderByDescending(sub { $_[0] })     # Sort descending
    ->Take(100)                             # Top 100
    ->ToArray();

print "Largest 100 successful responses:\n";
print "  ", join(", ", @result), "\n";

Lazy Processing of Large Files

# Process huge file with constant memory
LTSV::LINQ->FromLTSV("huge.log")
    ->Where(sub { $_[0]{level} eq 'ERROR' })
    ->ForEach(sub {
        my $rec = shift;
        print "ERROR at $rec->{time}: $rec->{message}\n";
    });

Quantifiers

# Check if all requests are successful
my $all_ok = LTSV::LINQ->FromLTSV("access.log")
    ->All(sub { $_[0]{status} < 400 });

print $all_ok ? "All OK\n" : "Some errors\n";

# Check if any errors exist
my $has_errors = LTSV::LINQ->FromLTSV("access.log")
    ->Any(sub { $_[0]{status} >= 500 });

print "Server errors detected\n" if $has_errors;

Data Transformation

# Read LTSV, transform, write back
LTSV::LINQ->FromLTSV("input.ltsv")
    ->Select(sub {
        my $rec = shift;
        return {
            %$rec,
            processed => 1,
            timestamp => time(),
        };
    })
    ->ToLTSV("output.ltsv");

Working with Arrays

# Query in-memory data
my @data = (
    {name => 'Alice', age => 30, city => 'Tokyo'},
    {name => 'Bob',   age => 25, city => 'Osaka'},
    {name => 'Carol', age => 35, city => 'Tokyo'},
);

my @tokyo_residents = LTSV::LINQ->From(\@data)
    ->Where(city => 'Tokyo')
    ->OrderBy(sub { $_[0]{age} })
    ->ToArray();

FEATURES

Lazy Evaluation

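Most query operations use lazy evaluation via iterators: data is processed on demand, one element at a time. The exceptions are the materializing operations (OrderBy, OrderByDescending, Reverse, GroupBy), which must read all of their input before producing output.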

# Only reads 10 records from file
my @top10 = LTSV::LINQ->FromLTSV("huge.log")
    ->Take(10)
    ->ToArray();

Method Chaining

All methods (except terminal operations like ToArray) return a new query object, enabling fluent method chaining.

->Where(...)->Select(...)->OrderBy(...)->Take(10)

DSL Syntax

Simple key-value filtering without code references.

# Readable and concise
->Where(status => '200', method => 'GET')

# Instead of
->Where(sub { $_[0]{status} eq '200' && $_[0]{method} eq 'GET' })

ARCHITECTURE

Iterator-Based Design

LTSV::LINQ uses an iterator-based architecture for lazy evaluation.

Core Concept:

Each query operation returns a new query object wrapping an iterator (a code reference that produces one element per call).

my $iter = sub {
    # Read next element
    # Apply transformation
    # Return element or undef
};

my $query = LTSV::LINQ->new($iter);

Benefits:

  • Memory Efficiency - O(1) memory for most operations

  • Lazy Evaluation - Elements computed on-demand

  • Composability - Iterators chain naturally

  • Early Termination - Stop processing when done
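
The iterator idea can be sketched in plain Perl with closures. The helper names below are hypothetical and not the module's internals. The sketch assumes undef marks the end of a sequence, so undef cannot itself be an element.

```perl
use strict;

# Hypothetical helpers, not part of LTSV::LINQ.
# Iterator over an array: one element per call, undef at the end.
sub iter_from_array {
    my ($array) = @_;
    my $i = 0;
    return sub { return $i < @$array ? $array->[$i++] : undef };
}

# Wrap an iterator with a filter.
sub iter_where {
    my ($iter, $pred) = @_;
    return sub {
        while (defined(my $item = $iter->())) {
            return $item if $pred->($item);
        }
        return undef;
    };
}

# Wrap an iterator with a transformation.
sub iter_select {
    my ($iter, $selector) = @_;
    return sub {
        my $item = $iter->();
        return defined $item ? $selector->($item) : undef;
    };
}

# Count how many elements are actually pulled from the source.
my $pulled  = 0;
my $source  = iter_from_array([1 .. 1000]);
my $counted = sub { my $x = $source->(); $pulled++ if defined $x; $x };

my $chain = iter_select(iter_where($counted, sub { $_[0] % 2 == 0 }),
                        sub { $_[0] * 10 });

# Take only the first three results, then stop pulling.
my @first3;
while (@first3 < 3 && defined(my $v = $chain->())) {
    push @first3, $v;
}
print "@first3 (pulled $pulled)\n";   # prints "20 40 60 (pulled 6)"
```

Only six of the thousand source elements are read, which shows early termination at work.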

Method Categories

Lazy Operations (return new query):

These operations return immediately, creating a new query object. No data processing occurs until a terminal operation is called.

  • Where, Select, SelectMany

  • Take, Skip, TakeWhile

  • Distinct

Terminal Operations (return value):

These operations consume the iterator and return a result. All lazy operations are executed at this point.

  • ToArray, ToList, ToLTSV

  • Count, Sum, Min, Max, Average

  • First, FirstOrDefault, Last

  • All, Any

  • ForEach

Materializing Operations (return new query, but eager):

These operations must consume the entire input before proceeding.

  • OrderBy, OrderByDescending, Reverse

  • GroupBy

Query Execution Flow

# Build query (lazy - no execution yet)
my $query = LTSV::LINQ->FromLTSV("access.log")
    ->Where(status => '200')      # Lazy
    ->Select(sub { $_[0]{url} })  # Lazy
    ->Distinct();                  # Lazy

# Execute query (terminal operation)
my @results = $query->ToArray();  # Now executes entire chain

Execution Order:

1. FromLTSV opens file and creates iterator
2. Where wraps iterator with filter
3. Select wraps with transformation
4. Distinct wraps with deduplication
5. ToArray pulls elements through chain

Each element flows through the entire chain before the next element is read.

Memory Characteristics

Constant Memory Operations:

  • Where, Select, SelectMany

  • Take, Skip, TakeWhile

  • Distinct (keeps a hash of seen values, so memory grows as O(unique elements))

  • ForEach, Count, Sum, Min, Max, Average

  • First, FirstOrDefault, Any, All

Linear Memory Operations:

  • ToArray, ToList (O(n))

  • OrderBy, OrderByDescending, Reverse (O(n))

  • GroupBy (O(n))

  • Last, LastOrDefault (O(n))
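
To see why an aggregate can run in constant memory, the plain-Perl sketch below averages values from an iterator while keeping only a running sum and a count. The helper names are hypothetical and this is not the module's implementation.

```perl
use strict;

# Hypothetical helpers, not part of LTSV::LINQ.
# Iterator over an array: one element per call, undef at the end.
sub iter_from_array {
    my ($array) = @_;
    my $i = 0;
    return sub { return $i < @$array ? $array->[$i++] : undef };
}

# Streaming average: two scalars of state, whatever the input length.
sub streaming_average {
    my ($iter) = @_;
    my ($sum, $count) = (0, 0);
    while (defined(my $x = $iter->())) {
        $sum += $x;
        $count++;
    }
    return $count ? $sum / $count : undef;   # undef for an empty input
}

my $avg = streaming_average(iter_from_array([1, 2, 3, 4, 5]));
print "$avg\n";                                # prints "3"

my $none = streaming_average(iter_from_array([]));
print defined $none ? $none : 'undef', "\n";   # prints "undef"
```

Sorting or reversing cannot work this way: the first output element may depend on the last input element, so all elements must be held at once.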

PERFORMANCE

Memory Efficiency

Lazy evaluation means memory usage is O(1) for most operations, regardless of input size.

# Processes 1GB file with constant memory
LTSV::LINQ->FromLTSV("1gb.log")
    ->Where(status => '500')
    ->ForEach(sub { print $_[0]{url}, "\n" });
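
The constant-memory behaviour comes from reading one line at a time. The sketch below shows a line-at-a-time reader in plain Perl. The helper name ltsv_iterator is hypothetical and is not how FromLTSV is necessarily implemented. The sample file is created with File::Temp only to make the example runnable.

```perl
use strict;
use File::Temp qw(tempfile);

# Build a small sample file so the sketch is runnable.
my ($out, $path) = tempfile(UNLINK => 1);
print $out "status:200\turl:/a\n", "status:500\turl:/b\n", "status:500\turl:/c\n";
close $out;

# Hypothetical helper, not part of LTSV::LINQ:
# returns an iterator that reads and parses one line per call.
sub ltsv_iterator {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open '$file': $!";
    return sub {
        my $line = <$fh>;
        return undef unless defined $line;   # end of file
        chomp $line;
        my %rec;
        for my $field (split /\t/, $line) {
            my ($label, $value) = split /:/, $field, 2;
            $rec{$label} = $value;
        }
        return \%rec;
    };
}

my $next = ltsv_iterator($path);
my @urls;
while (defined(my $rec = $next->())) {
    push @urls, $rec->{url} if $rec->{status} eq '500';
}
print "@urls\n";   # prints "/b /c"
```

Only the current record is held in memory while the loop runs. The @urls array grows because this example chooses to collect its results.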

Materializing Operations

These operations materialize the entire result set:

  • ToArray, ToList

  • OrderBy, OrderByDescending, Reverse

  • GroupBy

  • Last

For large datasets, use these operations carefully.

Optimization Tips

  • Filter early: Place Where clauses first

    # Good: Filter before expensive operations
    ->Where(status => '200')->OrderBy(...)->Take(10)
    
    # Bad: Order all data, then filter
    ->OrderBy(...)->Where(status => '200')->Take(10)

  • Limit early: Use Take to reduce processing

    # Process only what you need
    ->Take(1000)->GroupBy(...)

  • Avoid repeated ToArray: Reuse results

    # Bad: Calls ToArray twice; the second call finds the iterator exhausted
    my $count = scalar($query->ToArray());
    my @items = $query->ToArray();
    
    # Good: Call once, reuse
    my @items = $query->ToArray();
    my $count = scalar(@items);

COMPATIBILITY

Perl Version Support

This module is compatible with Perl 5.005_03 and later.

Tested on:

  • Perl 5.005_03 (released 1999)

  • Perl 5.6.x

  • Perl 5.8.x

  • Perl 5.10.x - 5.40.x

Compatibility Policy

Ancient Perl Support:

This module maintains compatibility with Perl 5.005_03 through careful coding practices:

  • No use of features introduced after 5.005

  • use warnings compatibility shim for pre-5.6

  • our keyword avoided (5.6+ feature)

  • Three-argument open used (5.6+ but safe)

  • No Unicode features required

  • No module dependencies beyond core

Why Perl 5.005 Support?

Many production systems, especially in enterprise and embedded environments, still run Perl 5.005 or 5.6. This module provides modern query capabilities to these systems without requiring upgrades.

Pure Perl Implementation

No XS Dependencies:

This module is implemented in Pure Perl with no XS (C extensions). Benefits:

  • Works on any Perl installation

  • No C compiler required

  • Easy installation in restricted environments

  • Consistent behavior across platforms

  • Simpler debugging and maintenance

Core Module Dependencies

None. This module uses only Perl core features available since 5.005.

No CPAN dependencies required.

DIAGNOSTICS

Error Messages

This module may throw the following exceptions:

From() requires ARRAY reference

Thrown by From() when the argument is not an array reference.

Example:

LTSV::LINQ->From("string");  # Dies
LTSV::LINQ->From([1, 2, 3]); # OK

Sequence contains no elements

Thrown by First(), Last(), or Average() when called on an empty sequence.

Methods that throw this error:

  • First()

  • Last()

  • Average()

To avoid this error, use the OrDefault variants:

  • FirstOrDefault() - returns undef instead of dying

  • LastOrDefault() - returns undef instead of dying

  • AverageOrDefault() - returns undef instead of dying

Example:

my @empty = ();
LTSV::LINQ->From(\@empty)->First();          # Dies
LTSV::LINQ->From(\@empty)->FirstOrDefault(); # Returns undef

No element satisfies the condition

Thrown by First() or Last() with a predicate when no element matches.

Example:

my @data = (1, 2, 3);
LTSV::LINQ->From(\@data)->First(sub { $_[0] > 10 });          # Dies
LTSV::LINQ->From(\@data)->FirstOrDefault(sub { $_[0] > 10 }); # Returns undef

Cannot open 'filename': ...

File I/O error when FromLTSV() cannot open the specified file.

Common causes:

  • File does not exist

  • Insufficient permissions

  • Invalid path

Example:

LTSV::LINQ->FromLTSV("/nonexistent/file.ltsv"); # Dies with this error

Methods That May Throw Exceptions

From($array_ref)

Dies if argument is not an array reference.

FromLTSV($filename)

Dies if file cannot be opened.

First([$predicate])

Dies if sequence is empty or no element matches predicate.

Safe alternative: FirstOrDefault()

Last([$predicate])

Dies if sequence is empty or no element matches predicate.

Safe alternative: LastOrDefault()

Average([$selector])

Dies if sequence is empty.

Safe alternative: AverageOrDefault()

Safe Alternatives

For methods that may throw exceptions, use the OrDefault variants:

First()   → FirstOrDefault()   (returns undef)
Last()    → LastOrDefault()    (returns undef)
Average() → AverageOrDefault() (returns undef)

Example:

# Unsafe - may die
my $first = LTSV::LINQ->From(\@data)->First();

# Safe - returns undef if empty
my $first = LTSV::LINQ->From(\@data)->FirstOrDefault();
if (defined $first) {
    # Process $first
}
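
When no OrDefault variant fits, a die can be trapped with eval, as with any Perl exception. The sketch below uses a stand-in function, first_or_die, which is hypothetical and only imitates the documented error message. It does not call LTSV::LINQ.

```perl
use strict;

# Stand-in for a call such as First() on an empty sequence
# (hypothetical, not part of LTSV::LINQ).
sub first_or_die {
    my ($array) = @_;
    die "Sequence contains no elements\n" unless @$array;
    return $array->[0];
}

my $error = '';
my $first = eval { first_or_die([]) };
if ($@) {
    $error = $@;
    chomp $error;
    print "Caught: $error\n";   # prints "Caught: Sequence contains no elements"
}

my $ok = eval { first_or_die([42]) };
print "Got: $ok\n";             # prints "Got: 42"
```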

FAQ

General Questions

Q: Why LINQ-style instead of SQL-style?

A: LINQ provides:

  • Method chaining (more Perl-like)

  • Conditions written as ordinary Perl code, checked when the program compiles

  • No string parsing required

  • Composable queries

Q: Can I reuse a query object?

A: No. Query objects use iterators that can only be consumed once.

# Wrong - iterator consumed by first ToArray
my $query = LTSV::LINQ->FromLTSV("file.ltsv");
my @first = $query->ToArray();   # OK
my @second = $query->ToArray();  # Empty! Iterator exhausted

# Right - create new query for each use
my $query1 = LTSV::LINQ->FromLTSV("file.ltsv");
my @first = $query1->ToArray();

my $query2 = LTSV::LINQ->FromLTSV("file.ltsv");
my @second = $query2->ToArray();

Q: How do I do OR conditions in Where?

A: Use code reference form with ||:

# OR condition requires code reference
->Where(sub { 
    $_[0]{status} == 200 || $_[0]{status} == 304 
})

# DSL combines its conditions with AND only
->Where(status => '200', method => 'GET')  # status AND method

Q: Why does my query seem to run multiple times?

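A: Each terminal operation makes its own pass over the data, and a query object's iterator can be consumed only once. Collect the results once and reuse them:

# Two terminal operations on the same query
my $avg = $query->Average(...);    # Pass 1: consumes the iterator
my @all = $query->ToArray();       # Pass 2: iterator already exhausted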

# Save result instead
my @all = $query->ToArray();
my $avg = LTSV::LINQ->From(\@all)->Average(...);

Performance Questions

Q: How can I process a huge file efficiently?

A: Use lazy operations and avoid materializing:

# Good - constant memory
LTSV::LINQ->FromLTSV("huge.log")
    ->Where(status => '500')
    ->ForEach(sub { print $_[0]{message}, "\n" });

# Bad - loads everything into memory
my @all = LTSV::LINQ->FromLTSV("huge.log")->ToArray();

Q: Why is OrderBy slow on large files?

A: OrderBy must load all elements into memory to sort them.

# Slow on 1GB file - loads everything
->OrderBy(sub { $_[0]{timestamp} })->Take(10)

# Faster - limit before sorting (if possible)
->Where(status => '500')->OrderBy(...)->Take(10)

Q: How do I process files larger than memory?

A: Use ForEach or streaming terminal operations:

# Process a 100GB file in roughly constant memory
my $error_count = 0;
LTSV::LINQ->FromLTSV("100gb.log")
    ->Where(sub { $_[0]{level} eq 'ERROR' })
    ->ForEach(sub { $error_count++ });

print "Errors: $error_count\n";

DSL Questions

Q: Can DSL do numeric comparisons?

A: No. DSL uses string equality (eq). Use code reference for numeric:

# DSL - string comparison
->Where(status => '200')  # $_[0]{status} eq '200'

# Code ref - numeric comparison
->Where(sub { $_[0]{status} == 200 })
->Where(sub { $_[0]{bytes} > 1000 })

Q: How do I do case-insensitive matching in DSL?

A: DSL doesn't support it. Use code reference:

# Case-insensitive requires code reference
->Where(sub { lc($_[0]{method}) eq 'get' })

Q: Can I use regular expressions in DSL?

A: No. Use code reference:

# Regex requires code reference
->Where(sub { $_[0]{url} =~ m{^/api/} })

Compatibility Questions

Q: Does this work on Perl 5.6?

A: Yes. Tested on Perl 5.005_03 through 5.40+.

Q: Do I need to install any CPAN modules?

A: No. Pure Perl with no dependencies beyond core.

Q: Can I use this on Windows?

A: Yes. Pure Perl works on all platforms.

Q: Why support such old Perl versions?

A: Many production systems cannot upgrade. This module provides modern query capabilities without requiring upgrades.

COOKBOOK

Common Patterns

Find top N by value

->OrderByDescending(sub { $_[0]{score} })
  ->Take(10)
  ->ToArray()

Group and count

->GroupBy(sub { $_[0]{category} })
  ->Select(sub {
      return {
          Category => $_[0]{Key},
          Count => scalar(@{$_[0]{Elements}})
      };
  })
  ->ToArray()

Running total

my $total = 0;
->Select(sub {
    $total += $_[0]{amount};
    # "return" makes Perl parse the braces as a hash reference, not a block
    return { %{$_[0]}, running_total => $total };
})

Pagination

# Page 3, size 20
->Skip(40)->Take(20)->ToArray()

Unique values

->Select(sub { $_[0]{category} })
  ->Distinct()
  ->ToArray()

Conditional aggregation

# A query can be consumed only once, so materialize the records first
my @records = $query->ToArray();

my $success_avg = LTSV::LINQ->From(\@records)
    ->Where(status => '200')
    ->Average(sub { $_[0]{response_time} });

my $error_avg = LTSV::LINQ->From(\@records)
    ->Where(sub { $_[0]{status} >= 400 })
    ->Average(sub { $_[0]{response_time} });

LIMITATIONS AND KNOWN ISSUES

Current Limitations

  • Iterator Consumption

    Query objects can only be consumed once. The iterator is exhausted after terminal operations.

    Workaround: Create new query object or save ToArray() result.

  • No Parallel Execution

    All operations execute sequentially in a single thread.

  • No Index Support

    All filtering requires full scan. No index optimization.

  • GroupBy Sorts Keys

    GroupBy returns groups in sorted key order. Original order is not preserved.

  • Distinct Uses String Keys

    Distinct with custom comparer uses stringified keys. May not work correctly for complex objects.
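
The Iterator Consumption limitation follows from how closure-based iterators keep their position. The plain-Perl sketch below shows the effect and the workaround. The helper names are hypothetical and not the module's internals.

```perl
use strict;

# Hypothetical helpers, not part of LTSV::LINQ.
# A closure-based iterator keeps its position in a private variable.
sub iter_from_array {
    my ($array) = @_;
    my $i = 0;
    return sub { return $i < @$array ? $array->[$i++] : undef };
}

# Drain an iterator into a list.
sub drain {
    my ($iter) = @_;
    my @out;
    while (defined(my $x = $iter->())) {
        push @out, $x;
    }
    return @out;
}

my $iter   = iter_from_array([10, 20, 30]);
my @first  = drain($iter);   # (10, 20, 30)
my @second = drain($iter);   # () - the position is already at the end

# Workaround: keep the materialized list and start a fresh iterator from it.
my @again = drain(iter_from_array(\@first));

print scalar(@first), " ", scalar(@second), " ", scalar(@again), "\n";  # prints "3 0 3"
```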

Not Implemented

The following LINQ methods are NOT implemented in this version:

  • Join, GroupJoin - Relational operations

  • Zip - Combine two sequences

  • Union, Intersect, Except - Additional set operations

  • OfType - Type filtering

  • Cast - Type conversion

  • DefaultIfEmpty - Default value for empty sequence

  • SequenceEqual - Sequence comparison

  • Concat - Sequence concatenation

  • Aggregate - Custom aggregation with seed

  • LongCount - 64-bit count

These may be added in future versions if there is demand.

BUGS

Please report any bugs or feature requests to:

  • Email: ina@cpan.org

SUPPORT

Documentation

Full documentation is available via:

perldoc LTSV::LINQ

CPAN

https://metacpan.org/pod/LTSV::LINQ

SEE ALSO

  • LTSV specification

    http://ltsv.org/

  • Microsoft LINQ documentation

    https://learn.microsoft.com/en-us/dotnet/csharp/linq/

AUTHOR

INABA Hitoshi <ina@cpan.org>

Contributors

Contributions are welcome! Please submit pull requests on GitHub.

ACKNOWLEDGEMENTS

LINQ Technology

This module is inspired by LINQ (Language Integrated Query), which was developed by Microsoft Corporation for the .NET Framework.

LINQ(R) is a registered trademark of Microsoft Corporation.

We are grateful to Microsoft for pioneering the LINQ technology and making it a widely recognized programming pattern. The elegance and power of LINQ has influenced query interfaces across many programming languages, and this module brings that same capability to LTSV data processing in Perl.

This module is not affiliated with, endorsed by, or sponsored by Microsoft Corporation.

COPYRIGHT AND LICENSE

Copyright (c) 2026 INABA Hitoshi

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

License Details

This module is released under the same license as Perl itself:

  • Artistic License

  • GNU General Public License (version 1 or later)

You may choose either license.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENSE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.