NAME
LTSV::LINQ - LINQ-style query interface for LTSV files
VERSION
Version 1.00
SYNOPSIS
use LTSV::LINQ;
# Read LTSV file and query
my @results = LTSV::LINQ->FromLTSV("access.log")
->Where(sub { $_[0]{status} eq '200' })
->Select(sub { $_[0]{url} })
->Distinct()
->ToArray();
# DSL syntax for simple filtering
my @errors = LTSV::LINQ->FromLTSV("access.log")
->Where(status => '404')
->ToArray();
# Grouping and aggregation
my @stats = LTSV::LINQ->FromLTSV("access.log")
->GroupBy(sub { $_[0]{status} })
->Select(sub {
my $g = shift;
return {
Status => $g->{Key},
Count => scalar(@{$g->{Elements}})
};
})
->OrderByDescending(sub { $_[0]{Count} })
->ToArray();
TABLE OF CONTENTS
"METHODS" - Complete method reference (30 methods)
"EXAMPLES" - 8 practical examples
"FEATURES" - Lazy evaluation, method chaining, DSL
"ARCHITECTURE" - Iterator design, execution flow
"PERFORMANCE" - Memory usage, optimization tips
"COMPATIBILITY" - Perl 5.005+ support, pure Perl
"DIAGNOSTICS" - Error messages
"FAQ" - Common questions and answers
"COOKBOOK" - Common patterns
"SEE ALSO" - Related resources
DESCRIPTION
LTSV::LINQ provides a LINQ-style query interface for LTSV (Labeled Tab-Separated Values) files. It offers a fluent, chainable API for filtering, transforming, and aggregating LTSV data.
Key features:
Lazy evaluation - O(1) memory usage for most operations
Method chaining - Fluent, readable query composition
DSL syntax - Simple key-value filtering
30+ LINQ methods - Comprehensive query capabilities
Pure Perl - No XS dependencies
Perl 5.005_03+ - Works on ancient and modern Perl
What is LTSV?
LTSV (Labeled Tab-Separated Values) is a text format for structured logs. Each line is one record, made of key:value pairs separated by tab characters.
Example (the fields are separated by tabs, shown here as spaces):
time:2026-02-13T10:00:00 status:200 url:/index.html bytes:1024
For more information: http://ltsv.org/
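The format described above can be parsed with a few lines of plain Perl. The following is a minimal sketch, not the module's own parser: the helper name parse_ltsv_line is hypothetical, and it assumes that only the first colon in a field separates the key from the value.

```perl
use strict;
use warnings;

# Hypothetical helper: split one LTSV line into a hash reference.
sub parse_ltsv_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                 # drop the line terminator
    my %record;
    for my $field (split /\t/, $line) {
        # Split on the first colon only, so values may contain colons
        my ($key, $value) = split /:/, $field, 2;
        next unless defined $key && length $key;
        $record{$key} = defined $value ? $value : '';
    }
    return \%record;
}

my $rec = parse_ltsv_line(
    "time:2026-02-13T10:00:00\tstatus:200\turl:/index.html\tbytes:1024\n"
);
print "$rec->{status} $rec->{url} $rec->{time}\n";
```

The limit argument of split matters here: without it, the timestamp value would be cut at its own colons.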
What is LINQ?
LINQ (Language Integrated Query) is a query syntax in C# and .NET. This module brings LINQ-style querying to Perl for LTSV data.
For more information: https://learn.microsoft.com/en-us/dotnet/csharp/linq/
METHODS
Complete Method Reference
This module implements 30 LINQ-style methods organized into 12 categories:
Data Sources (3): From, FromLTSV, Range
Filtering (1): Where (with DSL)
Projection (2): Select, SelectMany
Partitioning (3): Take, Skip, TakeWhile
Ordering (3): OrderBy, OrderByDescending, Reverse
Grouping (1): GroupBy
Set Operations (1): Distinct
Quantifiers (2): All, Any
Element Access (4): First, FirstOrDefault, Last, LastOrDefault
Aggregation (6): Count, Sum, Min, Max, Average, AverageOrDefault
Conversion (3): ToArray, ToList, ToLTSV
Utility (1): ForEach
Method Summary Table:
Method              Category        Lazy?  Returns
==================  ==============  =====  ================
From                Data Source     Yes    Query
FromLTSV            Data Source     Yes    Query
Range               Data Source     Yes    Query
Where               Filtering       Yes    Query
Select              Projection      Yes    Query
SelectMany          Projection      Yes    Query
Take                Partitioning    Yes    Query
Skip                Partitioning    Yes    Query
TakeWhile           Partitioning    Yes    Query
OrderBy             Ordering        No*    Query
OrderByDescending   Ordering        No*    Query
Reverse             Ordering        No*    Query
GroupBy             Grouping        No*    Query
Distinct            Set Operation   Yes    Query
All                 Quantifier      No     Boolean
Any                 Quantifier      No     Boolean
First               Element Access  No     Element
FirstOrDefault      Element Access  No     Element
Last                Element Access  No*    Element
LastOrDefault       Element Access  No*    Element
Count               Aggregation     No     Integer
Sum                 Aggregation     No     Number
Min                 Aggregation     No     Number
Max                 Aggregation     No     Number
Average             Aggregation     No     Number
AverageOrDefault    Aggregation     No     Number or undef
ToArray             Conversion      No     Array
ToList              Conversion      No     ArrayRef
ToLTSV              Conversion      No     Boolean
ForEach             Utility         No     Void
* Materializing operation (loads all data into memory)
Data Source Methods
- From(\@array)
-
Create a query from an array.
my $query = LTSV::LINQ->From([{name => 'Alice'}, {name => 'Bob'}]);
- FromLTSV($filename)
-
Create a query from an LTSV file.
my $query = LTSV::LINQ->FromLTSV("access.log");
- Range($start, $count)
-
Generate a sequence of integers.
my $query = LTSV::LINQ->Range(1, 10); # 1, 2, ..., 10
Filtering Methods
- Where($predicate)
- Where(key => value, ...)
-
Filter elements. Accepts either a code reference or DSL form.
Code Reference Form:
->Where(sub { $_[0]{status} == 200 })
->Where(sub { $_[0]{status} >= 400 && $_[0]{bytes} > 1000 })
The code reference receives each element as $_[0] and should return true to include the element, false to exclude it.
DSL Form:
The DSL (Domain Specific Language) form provides a concise syntax for simple equality comparisons. All conditions are combined with AND logic.
# Single condition
->Where(status => '200')
# Multiple conditions (AND)
->Where(status => '200', method => 'GET')
# Equivalent to:
->Where(sub { $_[0]{status} eq '200' && $_[0]{method} eq 'GET' })
DSL Specification:
All comparisons are string equality (eq)
All conditions are combined with AND
A record whose field is undefined does not match
For numeric comparisons or OR logic, use the code reference form
Examples:
# DSL: Simple and readable
->Where(status => '200')
->Where(user => 'alice', role => 'admin')
# Code ref: Complex logic
->Where(sub { $_[0]{status} >= 400 && $_[0]{status} < 500 })
->Where(sub { $_[0]{user} eq 'alice' || $_[0]{user} eq 'bob' })
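The DSL rules listed above (string equality, AND, undefined fails) can be expressed as a small predicate builder. This is a sketch of one possible translation, not the module's actual implementation; make_dsl_predicate and the sample records are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: turn key => value pairs into a predicate.
sub make_dsl_predicate {
    my %cond = @_;
    return sub {
        my ($rec) = @_;
        for my $key (keys %cond) {
            return 0 unless defined $rec->{$key};          # undefined field fails
            return 0 unless $rec->{$key} eq $cond{$key};   # string equality
        }
        return 1;                                          # every condition held (AND)
    };
}

my $is_ok_get = make_dsl_predicate(status => '200', method => 'GET');
my @records = (
    { status => '200', method => 'GET'  },
    { status => '200', method => 'POST' },
    { status => '200' },                   # no method field
);
my @matched = grep { $is_ok_get->($_) } @records;
print scalar(@matched), "\n";    # prints 1
```

Because the comparison is eq, a condition such as status => '200' would not match a stored value of '200.0'; numeric matching needs the code reference form.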
Projection Methods
- Select($selector)
-
Transform each element using the provided selector function.
The selector receives each element as $_[0] and should return the transformed value.
Parameters:
$selector - Code reference that transforms each element
Returns: New query with transformed elements (lazy)
Examples:
# Extract single field
->Select(sub { $_[0]{url} })
# Transform to new structure
->Select(sub { { path => $_[0]{url}, code => $_[0]{status} } })
# Calculate derived values
->Select(sub { $_[0]{bytes} * 8 }) # bytes to bits
Note: Select preserves one-to-one mapping. For one-to-many, use SelectMany.
- SelectMany($selector)
-
Flatten nested sequences into a single sequence.
The selector should return an array reference. All arrays are flattened into a single sequence.
Parameters:
$selector - Code reference returning array reference
Returns: New query with flattened elements (lazy)
Examples:
# Flatten array of arrays
my @nested = ([1, 2], [3, 4], [5]);
LTSV::LINQ->From(\@nested)
->SelectMany(sub { $_[0] })
->ToArray(); # (1, 2, 3, 4, 5)
# Expand related records (the leading + marks the braces as a hash constructor)
->SelectMany(sub {
my $user = shift;
return [ map { +{ user => $user->{name}, role => $_ } } @{$user->{roles}} ];
})
Use Cases:
Flattening nested arrays
Expanding one-to-many relationships
Generating multiple outputs per input
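For readers who think in Perl built-ins, the flattening that SelectMany performs corresponds roughly to a map whose block returns a list. The sketch below uses plain Perl only; the @users data and its field names are made up for illustration.

```perl
use strict;
use warnings;

# Flatten an array of arrays: each inner array reference is expanded in place.
my @nested = ([1, 2], [3, 4], [5]);
my @flat   = map { @$_ } @nested;
print "@flat\n";    # prints "1 2 3 4 5"

# One-to-many expansion: one output record per (user, role) pair.
# The data below is hypothetical.
my @users = (
    { name => 'alice', roles => ['admin', 'dev'] },
    { name => 'bob',   roles => ['dev'] },
);
my @pairs = map {
    my $user = $_;
    map { +{ user => $user->{name}, role => $_ } } @{ $user->{roles} };
} @users;
print scalar(@pairs), " pairs\n";    # prints "3 pairs"
```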
Partitioning Methods
- Take($count)
-
Take the first N elements from the sequence.
Parameters:
$count - Number of elements to take (integer >= 0)
Returns: New query limited to first N elements (lazy)
Examples:
# Top 10 results
->OrderByDescending(sub { $_[0]{score} })
->Take(10)
# First record only
->Take(1)->ToArray()
# Limit large file processing
LTSV::LINQ->FromLTSV("huge.log")->Take(1000)
Note: Take(0) returns an empty sequence. Negative values are treated as 0.
- Skip($count)
-
Skip the first N elements, return the rest.
Parameters:
$count - Number of elements to skip (integer >= 0)
Returns: New query skipping first N elements (lazy)
Examples:
# Skip header row
->Skip(1)
# Pagination: page 3, size 20
->Skip(40)->Take(20)
# Skip first batch
->Skip(1000)->ForEach(sub { ... })
Use Cases:
Pagination
Skipping header rows
Processing in batches
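The pagination example above follows from simple arithmetic: page N of size S skips (N - 1) * S elements and takes S. A minimal sketch of that calculation on an in-memory list, with hypothetical helper names page_to_skip_take and paginate:

```perl
use strict;
use warnings;

# Hypothetical helper: translate a 1-based page number into Skip/Take counts.
sub page_to_skip_take {
    my ($page, $size) = @_;
    die "page must be >= 1\n" if $page < 1;
    return (($page - 1) * $size, $size);
}

# Hypothetical helper: apply the same arithmetic to an array reference.
sub paginate {
    my ($items, $page, $size) = @_;
    my ($skip, $take) = page_to_skip_take($page, $size);
    return () if $skip > $#$items;             # page lies past the end
    my $last = $skip + $take - 1;
    $last = $#$items if $last > $#$items;      # short final page
    return @{$items}[$skip .. $last];
}

my @items = (1 .. 95);
my ($skip, $take) = page_to_skip_take(3, 20);
print "skip=$skip take=$take\n";               # prints "skip=40 take=20"
my @page3 = paginate(\@items, 3, 20);          # items 41 .. 60
my @page5 = paginate(\@items, 5, 20);          # items 81 .. 95
print scalar(@page3), " and ", scalar(@page5), " items\n";
```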
- TakeWhile($predicate)
-
Take elements while the predicate is true. Stops at first false.
Parameters:
$predicate - Code reference returning boolean
Returns: New query taking elements while predicate holds (lazy)
Examples:
# Take while value is small
->TakeWhile(sub { $_[0]{count} < 100 })
# Take while timestamp is in range
->TakeWhile(sub { $_[0]{time} lt '2026-02-01' })
# Process until error
->TakeWhile(sub { $_[0]{status} < 400 })
Important: TakeWhile stops immediately when predicate returns false. It does NOT filter - it terminates the sequence.
# Different from Where:
->TakeWhile(sub { $_[0] < 5 }) # 1,2,3,4 then STOP
->Where(sub { $_[0] < 5 }) # 1,2,3,4 (checks all)
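The difference only shows up when a failing element is followed by passing ones. The plain-Perl sketch below mimics both behaviours on an unsorted list (the input values are made up): grep stands in for Where, and a loop with last stands in for TakeWhile.

```perl
use strict;
use warnings;

my @values = (1, 2, 7, 3, 4);

# Where-like behaviour: test every element, keep those that pass.
my @filtered = grep { $_ < 5 } @values;

# TakeWhile-like behaviour: stop at the first element that fails.
my @prefix;
for my $v (@values) {
    last unless $v < 5;
    push @prefix, $v;
}

print "filter:     @filtered\n";   # prints "filter:     1 2 3 4"
print "take-while: @prefix\n";     # prints "take-while: 1 2"
```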
Ordering Methods
- OrderBy($key_selector)
-
Sort in ascending order.
->OrderBy(sub { $_[0]{timestamp} })
- OrderByDescending($key_selector)
-
Sort in descending order.
->OrderByDescending(sub { $_[0]{count} })
- Reverse()
-
Reverse the order.
->Reverse()
Grouping Methods
- GroupBy($key_selector [, $element_selector])
-
Group elements by key.
->GroupBy(sub { $_[0]{status} })
Returns a query whose elements are hashrefs with 'Key' and 'Elements' fields. 'Elements' is an array reference holding the members of the group.
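The shape of the grouped result can be illustrated in plain Perl. This sketch is not the module's code: group_by is a hypothetical helper, and it returns groups in sorted key order only to mirror the behaviour described under LIMITATIONS AND KNOWN ISSUES.

```perl
use strict;
use warnings;

# Hypothetical helper: group records and return Key/Elements hashrefs.
sub group_by {
    my ($records, $key_selector) = @_;
    my %groups;
    for my $rec (@$records) {
        my $key = $key_selector->($rec);
        push @{ $groups{$key} }, $rec;     # autovivifies the array reference
    }
    # One hashref per group, in sorted (string) key order
    return map { +{ Key => $_, Elements => $groups{$_} } } sort keys %groups;
}

my @records = (
    { status => '200', url => '/a' },
    { status => '404', url => '/b' },
    { status => '200', url => '/c' },
);
my @groups = group_by(\@records, sub { $_[0]{status} });
for my $g (@groups) {
    printf "%s: %d\n", $g->{Key}, scalar @{ $g->{Elements} };
}
```

Every record stays referenced from the %groups hash until grouping finishes, which is why grouping needs memory proportional to the input.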
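Set Operations
- Distinct()
-
Remove duplicate elements from the sequence (lazy).
->Select(sub { $_[0]{url} })->Distinct()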
Quantifier Methods
- All($predicate)
-
Test if all elements satisfy condition.
->All(sub { $_[0]{status} == 200 })
- Any([$predicate])
-
Test if any element satisfies condition.
->Any(sub { $_[0]{status} >= 400 })
->Any() # Test if sequence is non-empty
Element Access Methods
- First([$predicate])
-
Get first element. Dies if empty.
->First()
->First(sub { $_[0]{status} == 404 })
- FirstOrDefault([$predicate,] $default)
-
Get first element or default value.
->FirstOrDefault(undef, {})
- Last([$predicate])
-
Get last element. Dies if empty.
->Last()
Aggregation Methods
All aggregation methods are terminal operations - they consume the entire sequence and return a scalar value.
- Count([$predicate])
-
Count the number of elements.
Parameters:
$predicate - (Optional) Code reference to filter elements
Returns: Integer count
Examples:
# Count all
->Count() # 1000
# Count with condition
->Count(sub { $_[0]{status} >= 400 }) # 42
# Equivalent to
->Where(sub { $_[0]{status} >= 400 })->Count()
Performance: O(n) - must iterate entire sequence
- Sum([$selector])
-
Calculate sum of numeric values.
Parameters:
$selector - (Optional) Code reference to extract value. Default: identity function
Returns: Numeric sum
Examples:
# Sum of values
LTSV::LINQ->From([1, 2, 3, 4, 5])->Sum() # 15
# Sum of field
->Sum(sub { $_[0]{bytes} })
# Sum with transformation
->Sum(sub { $_[0]{price} * $_[0]{quantity} })
Note: Non-numeric values may produce warnings. Make sure the selector returns numbers.
- Min([$selector])
-
Find minimum value.
Parameters:
$selector - (Optional) Code reference to extract value
Returns: Minimum value (numeric comparison)
Examples:
# Minimum of values
->Min()
# Minimum of field
->Min(sub { $_[0]{response_time} })
# Oldest timestamp (assumes numeric timestamps, since the comparison is numeric)
->Min(sub { $_[0]{timestamp} })
Returns: undef if sequence is empty
- Max([$selector])
-
Find maximum value.
Parameters:
$selector - (Optional) Code reference to extract value
Returns: Maximum value (numeric comparison)
Examples:
# Maximum of values
->Max()
# Maximum of field
->Max(sub { $_[0]{bytes} })
# Latest timestamp (assumes numeric timestamps, since the comparison is numeric)
->Max(sub { $_[0]{timestamp} })
Returns: undef if sequence is empty
- Average([$selector])
-
Calculate arithmetic mean.
Parameters:
$selector - (Optional) Code reference to extract value
Returns: Numeric average (floating point)
Examples:
# Average of values
LTSV::LINQ->From([1, 2, 3, 4, 5])->Average() # 3
# Average of field
->Average(sub { $_[0]{bytes} })
# Average response time
->Average(sub { $_[0]{response_time} })
Throws: Dies with "Sequence contains no elements" if empty
Note: Returns floating point. Use int() for an integer result (int truncates toward zero).
- AverageOrDefault([$selector])
-
Calculate arithmetic mean, or return undef if sequence is empty.
Parameters:
$selector - (Optional) Code reference to extract value
Returns: Numeric average (floating point), or undef if empty
Examples:
# Safe average - returns undef for empty sequence
my @empty = ();
my $avg = LTSV::LINQ->From(\@empty)->AverageOrDefault(); # undef
# With data
LTSV::LINQ->From([1, 2, 3])->AverageOrDefault(); # 2
# With selector
->AverageOrDefault(sub { $_[0]{value} })
Note: Unlike Average(), this method never throws an exception.
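The contrast between the two averaging methods can be sketched in plain Perl. The functions average and average_or_default below are hypothetical stand-ins that work on a plain list; only the error text is taken from the documentation above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Average(): dies on an empty list.
sub average {
    my @values = @_;
    die "Sequence contains no elements\n" unless @values;
    my $sum = 0;
    $sum += $_ for @values;
    return $sum / scalar(@values);
}

# Hypothetical stand-in for AverageOrDefault(): undef on an empty list.
sub average_or_default {
    my @values = @_;
    return @values ? average(@values) : undef;
}

my $avg  = average(1, 2, 3, 4, 5);
my $safe = average_or_default();
print "avg=$avg safe=", (defined $safe ? $safe : 'undef'), "\n";   # prints "avg=3 safe=undef"
```

A caller of the OrDefault form still has to test the result with defined before using it in arithmetic.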
Conversion Methods
- ToArray()
-
Convert to array.
my @array = $query->ToArray();
- ToList()
-
Convert to array reference.
my $arrayref = $query->ToList();
- ToLTSV($filename)
-
Write to LTSV file.
$query->ToLTSV("output.ltsv");
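Writing LTSV is the reverse of reading it: each record becomes one line of key:value fields joined by tabs. The sketch below shows that serialization for a single hash reference. It is an illustration, not the module's writer; format_ltsv_line is a hypothetical name, and emitting fields in sorted key order is an assumption made here so the output is predictable.

```perl
use strict;
use warnings;

# Hypothetical helper: serialize one record as an LTSV line.
sub format_ltsv_line {
    my ($rec) = @_;
    # Sorted key order is an assumption of this sketch
    my @fields = map { $_ . ':' . $rec->{$_} } sort keys %$rec;
    return join("\t", @fields) . "\n";
}

my $line = format_ltsv_line({ url => '/index.html', status => '200' });
print $line;    # prints "status:200<TAB>url:/index.html"
```

Values that themselves contain a tab or newline cannot be represented this way and would need to be rejected or rewritten before output.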
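Utility Methods
- ForEach($action)
-
Call the action once for each element. This is a terminal operation.
->ForEach(sub { print $_[0]{url}, "\n" })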
EXAMPLES
Basic Filtering
use LTSV::LINQ;
# DSL syntax
my @successful = LTSV::LINQ->FromLTSV("access.log")
->Where(status => '200')
->ToArray();
# Code reference
my @errors = LTSV::LINQ->FromLTSV("access.log")
->Where(sub { $_[0]{status} >= 400 })
->ToArray();
Aggregation
# Count errors
my $error_count = LTSV::LINQ->FromLTSV("access.log")
->Where(sub { $_[0]{status} >= 400 })
->Count();
# Average bytes for successful requests
my $avg_bytes = LTSV::LINQ->FromLTSV("access.log")
->Where(status => '200')
->Average(sub { $_[0]{bytes} });
print "Average bytes: $avg_bytes\n";
Grouping and Ordering
# Top 10 URLs by request count
my @top_urls = LTSV::LINQ->FromLTSV("access.log")
->Where(sub { $_[0]{status} eq '200' })
->GroupBy(sub { $_[0]{url} })
->Select(sub {
my $g = shift;
return {
URL => $g->{Key},
Count => scalar(@{$g->{Elements}}),
TotalBytes => LTSV::LINQ->From($g->{Elements})
->Sum(sub { $_[0]{bytes} })
};
})
->OrderByDescending(sub { $_[0]{Count} })
->Take(10)
->ToArray();
for my $stat (@top_urls) {
printf "%5d requests - %s (%d bytes)\n",
$stat->{Count}, $stat->{URL}, $stat->{TotalBytes};
}
Complex Query Chain
# Multi-step analysis
my @result = LTSV::LINQ->FromLTSV("access.log")
->Where(status => '200') # Filter successful
->Select(sub { $_[0]{bytes} }) # Extract bytes
->Where(sub { $_[0] > 1000 }) # Large responses only
->OrderByDescending(sub { $_[0] }) # Sort descending
->Take(100) # Top 100
->ToArray();
print "Largest 100 successful responses:\n";
print " ", join(", ", @result), "\n";
Lazy Processing of Large Files
# Process huge file with constant memory
LTSV::LINQ->FromLTSV("huge.log")
->Where(sub { $_[0]{level} eq 'ERROR' })
->ForEach(sub {
my $rec = shift;
print "ERROR at $rec->{time}: $rec->{message}\n";
});
Quantifiers
# Check if all requests are successful
my $all_ok = LTSV::LINQ->FromLTSV("access.log")
->All(sub { $_[0]{status} < 400 });
print $all_ok ? "All OK\n" : "Some errors\n";
# Check if any errors exist
my $has_errors = LTSV::LINQ->FromLTSV("access.log")
->Any(sub { $_[0]{status} >= 500 });
print "Server errors detected\n" if $has_errors;
Data Transformation
# Read LTSV, transform, write back
LTSV::LINQ->FromLTSV("input.ltsv")
->Select(sub {
my $rec = shift;
return {
%$rec,
processed => 1,
timestamp => time(),
};
})
->ToLTSV("output.ltsv");
Working with Arrays
# Query in-memory data
my @data = (
{name => 'Alice', age => 30, city => 'Tokyo'},
{name => 'Bob', age => 25, city => 'Osaka'},
{name => 'Carol', age => 35, city => 'Tokyo'},
);
my @tokyo_residents = LTSV::LINQ->From(\@data)
->Where(city => 'Tokyo')
->OrderBy(sub { $_[0]{age} })
->ToArray();
FEATURES
Lazy Evaluation
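Most query operations use lazy evaluation via iterators: data is processed on demand, one element at a time, rather than all at once. The exceptions are the materializing operations (OrderBy, OrderByDescending, Reverse, GroupBy), which must read their whole input first.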
# Only reads 10 records from file
my @top10 = LTSV::LINQ->FromLTSV("huge.log")
->Take(10)
->ToArray();
Method Chaining
All methods (except terminal operations like ToArray) return a new query object, enabling fluent method chaining.
->Where(...)->Select(...)->OrderBy(...)->Take(10)
DSL Syntax
Simple key-value filtering without code references.
# Readable and concise
->Where(status => '200', method => 'GET')
# Instead of
->Where(sub { $_[0]{status} eq '200' && $_[0]{method} eq 'GET' })
ARCHITECTURE
Iterator-Based Design
LTSV::LINQ uses an iterator-based architecture for lazy evaluation.
Core Concept:
Each query operation returns a new query object wrapping an iterator (a code reference that produces one element per call).
my $iter = sub {
# Read next element
# Apply transformation
# Return element or undef
};
my $query = LTSV::LINQ->new($iter);
Benefits:
Memory Efficiency - O(1) memory for most operations
Lazy Evaluation - Elements computed on-demand
Composability - Iterators chain naturally
Early Termination - Stop processing when done
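The benefits listed above follow from composing closures. The sketch below builds a tiny chain from hypothetical helpers (iter_from_array, iter_where, iter_select, iter_take); it follows the "return element or undef" convention shown earlier but is not the module's implementation. A counter on the source shows early termination.

```perl
use strict;
use warnings;

# Source: iterate over an array reference; undef signals exhaustion.
sub iter_from_array {
    my @items = @{ $_[0] };
    my $i = 0;
    return sub { return $i < @items ? $items[$i++] : undef };
}

# Where: pull from upstream until an element passes the predicate.
sub iter_where {
    my ($upstream, $predicate) = @_;
    return sub {
        while (defined(my $item = $upstream->())) {
            return $item if $predicate->($item);
        }
        return undef;
    };
}

# Select: transform each element as it passes through.
sub iter_select {
    my ($upstream, $selector) = @_;
    return sub {
        my $item = $upstream->();
        return defined $item ? $selector->($item) : undef;
    };
}

# Take: stop pulling from upstream after $count elements.
sub iter_take {
    my ($upstream, $count) = @_;
    my $taken = 0;
    return sub {
        return undef if $taken >= $count;    # early termination
        $taken++;
        return $upstream->();
    };
}

my $reads   = 0;
my $source  = iter_from_array([1 .. 1000]);
my $counted = sub { my $x = $source->(); $reads++ if defined $x; return $x };
my $chain   = iter_take(
    iter_select(iter_where($counted, sub { $_[0] % 2 == 0 }), sub { $_[0] * 10 }),
    3,
);

my @out;
while (defined(my $x = $chain->())) { push @out, $x }
print "@out (source reads: $reads)\n";    # prints "20 40 60 (source reads: 6)"
```

Only six of the thousand source elements are ever read. One consequence of using undef as the end marker is that a sequence built this way cannot carry undef as an element.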
Method Categories
Lazy Operations (return new query):
These operations return immediately, creating a new query object. No data processing occurs until a terminal operation is called.
Where, Select, SelectMany
Take, Skip, TakeWhile
Distinct
Terminal Operations (return value):
These operations consume the iterator and return a result. All lazy operations are executed at this point.
ToArray, ToList, ToLTSV
Count, Sum, Min, Max, Average, AverageOrDefault
First, FirstOrDefault, Last, LastOrDefault
All, Any
ForEach
Materializing Operations (return new query, but eager):
These operations must consume the entire input before proceeding.
OrderBy, OrderByDescending, Reverse
GroupBy
Query Execution Flow
# Build query (lazy - no execution yet)
my $query = LTSV::LINQ->FromLTSV("access.log")
->Where(status => '200') # Lazy
->Select(sub { $_[0]{url} }) # Lazy
->Distinct(); # Lazy
# Execute query (terminal operation)
my @results = $query->ToArray(); # Now executes entire chain
Execution Order:
1. FromLTSV opens file and creates iterator
2. Where wraps iterator with filter
3. Select wraps with transformation
4. Distinct wraps with deduplication
5. ToArray pulls elements through chain
Each element flows through the entire chain before the next element is read.
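The interleaving described above can be made visible by logging each stage. In this sketch the stage closures and the @log array are illustrative only; the point is the order of the log entries, not the API.

```perl
use strict;
use warnings;

my @log;
my @input = ('a', 'b');
my $i = 0;

# Stage 1: source, logs every read
my $source = sub {
    return undef if $i >= @input;
    my $x = $input[$i++];
    push @log, "read $x";
    return $x;
};

# Stage 2: a Select-like transformation, logs before transforming
my $upper = sub {
    my $x = $source->();
    return undef unless defined $x;
    push @log, "select $x";
    return uc $x;
};

# Terminal stage: pull elements through the chain one at a time
my @out;
while (defined(my $x = $upper->())) {
    push @log, "collect $x";
    push @out, $x;
}

print join(", ", @log), "\n";
# prints "read a, select a, collect A, read b, select b, collect B"
```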
Memory Characteristics
Constant Memory Operations:
Where, Select, SelectMany
Take, Skip, TakeWhile
Distinct (the exception in this group: it keeps a hash of seen values, so memory is O(number of unique elements) rather than constant)
ForEach, Count, Sum, Min, Max, Average
First, FirstOrDefault, Any, All
Linear Memory Operations:
ToArray, ToList (O(n))
OrderBy, OrderByDescending, Reverse (O(n))
GroupBy (O(n))
Last, LastOrDefault (O(n))
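Distinct is a useful case to study because it is lazy yet not constant-memory. The sketch below shows one way a streaming de-duplication step can work, using a hash keyed on the stringified element; iter_distinct is a hypothetical name, and the input data is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: yield each distinct value once, in first-seen order.
sub iter_distinct {
    my ($upstream) = @_;
    my %seen;    # grows by one entry per unique (stringified) value
    return sub {
        while (defined(my $item = $upstream->())) {
            return $item unless $seen{$item}++;
        }
        return undef;    # upstream exhausted
    };
}

my @input = qw(/a /b /a /c /b);
my $i = 0;
my $source   = sub { return $i < @input ? $input[$i++] : undef };
my $distinct = iter_distinct($source);

my @unique;
while (defined(my $x = $distinct->())) { push @unique, $x }
print "@unique\n";    # prints "/a /b /c"
```

Elements are still produced one at a time, but the hash never shrinks, so a column with millions of unique values costs millions of hash entries.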
PERFORMANCE
Memory Efficiency
Lazy evaluation means memory usage is O(1) for most operations, regardless of input size.
# Processes 1GB file with constant memory
LTSV::LINQ->FromLTSV("1gb.log")
->Where(status => '500')
->ForEach(sub { print $_[0]{url}, "\n" });
Materializing Operations
These operations load the entire sequence into memory:
ToArray, ToList
OrderBy, OrderByDescending, Reverse
GroupBy
Last, LastOrDefault
For large datasets, use these operations carefully.
Optimization Tips
Filter early: Place Where clauses first
# Good: Filter before expensive operations
->Where(status => '200')->OrderBy(...)->Take(10)
# Bad: Order all data, then filter
->OrderBy(...)->Where(status => '200')->Take(10)
Limit early: Use Take to reduce processing when only the first records are needed
# Process only what you need
->Take(1000)->GroupBy(...)
Avoid repeated ToArray: Reuse results
# Bad: Calls ToArray twice
my $count = scalar($query->ToArray());
my @items = $query->ToArray();
# Good: Call once, reuse
my @items = $query->ToArray();
my $count = scalar(@items);
COMPATIBILITY
Perl Version Support
This module is compatible with Perl 5.005_03 and later.
Tested on:
Perl 5.005_03 (released 1999)
Perl 5.6.x
Perl 5.8.x
Perl 5.10.x - 5.40.x
Compatibility Policy
Ancient Perl Support:
This module maintains compatibility with Perl 5.005_03 through careful coding practices:
No use of features introduced after 5.005
use warnings compatibility shim for pre-5.6
our keyword avoided (5.6+ feature)
Three-argument open used (5.6+ but safe)
No Unicode features required
No module dependencies beyond core
Why Perl 5.005 Support?:
Many production systems, especially in enterprise and embedded environments, still run Perl 5.005 or 5.6. This module provides modern query capabilities to these systems without requiring upgrades.
Pure Perl Implementation
No XS Dependencies:
This module is implemented in Pure Perl with no XS (C extensions). Benefits:
Works on any Perl installation
No C compiler required
Easy installation in restricted environments
Consistent behavior across platforms
Simpler debugging and maintenance
Core Module Dependencies
None. This module uses only Perl core features available since 5.005.
No CPAN dependencies required.
DIAGNOSTICS
Error Messages
This module may throw the following exceptions:
- From() requires ARRAY reference
-
Thrown by From() when the argument is not an array reference.
Example:
LTSV::LINQ->From("string"); # Dies
LTSV::LINQ->From([1, 2, 3]); # OK
- Sequence contains no elements
-
Thrown by First(), Last(), or Average() when called on an empty sequence.
Methods that throw this error:
First()
Last()
Average()
To avoid this error, use the OrDefault variants:
FirstOrDefault() - returns undef instead of dying
LastOrDefault() - returns undef instead of dying
AverageOrDefault() - returns undef instead of dying
Example:
my @empty = ();
LTSV::LINQ->From(\@empty)->First(); # Dies
LTSV::LINQ->From(\@empty)->FirstOrDefault(); # Returns undef
- No element satisfies the condition
-
Thrown by First() or Last() with a predicate when no element matches.
Example:
my @data = (1, 2, 3);
LTSV::LINQ->From(\@data)->First(sub { $_[0] > 10 }); # Dies
LTSV::LINQ->From(\@data)->FirstOrDefault(sub { $_[0] > 10 }); # Returns undef
- Cannot open 'filename': ...
-
File I/O error when FromLTSV() cannot open the specified file.
Common causes:
File does not exist
Insufficient permissions
Invalid path
Example:
LTSV::LINQ->FromLTSV("/nonexistent/file.ltsv"); # Dies with this error
Methods That May Throw Exceptions
- From($array_ref)
-
Dies if argument is not an array reference.
- FromLTSV($filename)
-
Dies if file cannot be opened.
- First([$predicate])
-
Dies if sequence is empty or no element matches predicate.
Safe alternative: FirstOrDefault()
- Last([$predicate])
-
Dies if sequence is empty or no element matches predicate.
Safe alternative: LastOrDefault()
- Average([$selector])
-
Dies if sequence is empty.
Safe alternative: AverageOrDefault()
Safe Alternatives
For methods that may throw exceptions, use the OrDefault variants:
First() → FirstOrDefault() (returns undef)
Last() → LastOrDefault() (returns undef)
Average() → AverageOrDefault() (returns undef)
Example:
# Unsafe - may die
my $first = LTSV::LINQ->From(\@data)->First();
# Safe - returns undef if empty
my $first = LTSV::LINQ->From(\@data)->FirstOrDefault();
if (defined $first) {
# Process $first
}
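When the dying variant is preferred, the exception can be trapped with a block eval, which is available on every Perl version this module targets. The sketch below uses a hypothetical stand-in function, first_element, so that it runs without any input data; only the error text is taken from the list of messages above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method that dies on an empty sequence.
sub first_element {
    my @items = @_;
    die "Sequence contains no elements\n" unless @items;
    return $items[0];
}

my @data = ();
my $first;
my $caught = '';

# The trailing 1 makes eval return true on success, even when the
# element itself is a false value such as 0 or ''.
my $ok = eval { $first = first_element(@data); 1 };
if (!$ok) {
    $caught = $@;
    chomp $caught;
    print "caught: $caught\n";    # prints "caught: Sequence contains no elements"
}
```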
FAQ
General Questions
- Q: Why LINQ-style instead of SQL-style?
-
A: LINQ provides:
Method chaining (more Perl-like)
Type safety through code
No string parsing required
Composable queries
- Q: Can I reuse a query object?
-
A: No. Query objects use iterators that can only be consumed once.
# Wrong - iterator consumed by first ToArray
my $query = LTSV::LINQ->FromLTSV("file.ltsv");
my @first = $query->ToArray(); # OK
my @second = $query->ToArray(); # Empty! Iterator exhausted
# Right - create new query for each use
my $query1 = LTSV::LINQ->FromLTSV("file.ltsv");
my @first = $query1->ToArray();
my $query2 = LTSV::LINQ->FromLTSV("file.ltsv");
my @second = $query2->ToArray();
- Q: How do I do OR conditions in Where?
-
A: Use code reference form with ||:
# OR condition requires code reference
->Where(sub { $_[0]{status} == 200 || $_[0]{status} == 304 })
# The DSL form can only combine conditions with AND
->Where(status => '200', method => 'GET')
- Q: Why does my query seem to run multiple times?
-
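A: Each terminal operation makes its own pass over the data, so two terminal operations mean two passes. Because a query object can be consumed only once, the second pass also needs a fresh query. It is usually simpler to materialize once and reuse the result:
# Two terminal operations on the same query
my $avg = $query->Average(...); # Pass 1: consumes the iterator
my @all = $query->ToArray(); # Pass 2: the iterator is already exhausted
# Save result instead
my @all = $query->ToArray();
my $avg = LTSV::LINQ->From(\@all)->Average(...);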
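The single-use behaviour discussed in the answers above is a property of closure-based iterators in general. A minimal sketch with a hand-written iterator (the drain helper and the data are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: pull everything an iterator has left.
sub drain {
    my ($iter) = @_;
    my @out;
    while (defined(my $x = $iter->())) { push @out, $x }
    return @out;
}

my @rows = (10, 20, 30);
my $i = 0;
my $iter = sub { return $i < @rows ? $rows[$i++] : undef };

my @first  = drain($iter);    # consumes all three elements
my @second = drain($iter);    # nothing left: the position is not reset

print scalar(@first), " then ", scalar(@second), "\n";    # prints "3 then 0"
```

The closure keeps its position in $i, and nothing resets it, which is why saving the materialized array is the reliable way to use the data twice.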
Performance Questions
- Q: How can I process a huge file efficiently?
-
A: Use lazy operations and avoid materializing:
# Good - constant memory
LTSV::LINQ->FromLTSV("huge.log")
->Where(status => '500')
->ForEach(sub { print $_[0]{message}, "\n" });
# Bad - loads everything into memory
my @all = LTSV::LINQ->FromLTSV("huge.log")->ToArray();
- Q: Why is OrderBy slow on large files?
-
A: OrderBy must load all elements into memory to sort them.
# Slow on 1GB file - loads everything
->OrderBy(sub { $_[0]{timestamp} })->Take(10)
# Faster - limit before sorting (if possible)
->Where(status => '500')->OrderBy(...)->Take(10)
- Q: How do I process files larger than memory?
-
A: Use ForEach or streaming terminal operations:
# Process a 100GB file in roughly constant memory
my $error_count = 0;
LTSV::LINQ->FromLTSV("100gb.log")
->Where(sub { $_[0]{level} eq 'ERROR' })
->ForEach(sub { $error_count++ });
print "Errors: $error_count\n";
DSL Questions
- Q: Can DSL do numeric comparisons?
-
A: No. DSL uses string equality (eq). Use code reference for numeric:
# DSL - string comparison
->Where(status => '200') # $_[0]{status} eq '200'
# Code ref - numeric comparison
->Where(sub { $_[0]{status} == 200 })
->Where(sub { $_[0]{bytes} > 1000 })
- Q: How do I do case-insensitive matching in DSL?
-
A: DSL doesn't support it. Use code reference:
# Case-insensitive requires code reference
->Where(sub { lc($_[0]{method}) eq 'get' })
- Q: Can I use regular expressions in DSL?
-
A: No. Use code reference:
# Regex requires code reference
->Where(sub { $_[0]{url} =~ m{^/api/} })
Compatibility Questions
- Q: Does this work on Perl 5.6?
-
A: Yes. Tested on Perl 5.005_03 through 5.40+.
- Q: Do I need to install any CPAN modules?
-
A: No. Pure Perl with no dependencies beyond core.
- Q: Can I use this on Windows?
-
A: Yes. Pure Perl works on all platforms.
- Q: Why support such old Perl versions?
-
A: Many production systems cannot upgrade. This module provides modern query capabilities without requiring upgrades.
COOKBOOK
Common Patterns
- Find top N by value
-
->OrderByDescending(sub { $_[0]{score} })
->Take(10)
->ToArray()
- Group and count
-
->GroupBy(sub { $_[0]{category} })
->Select(sub {
return { Category => $_[0]{Key}, Count => scalar(@{$_[0]{Elements}}) };
})
->ToArray()
- Running total
-
my $total = 0;
->Select(sub {
$total += $_[0]{amount};
# "return" makes Perl parse the braces as a hash reference, not a block
return { %{$_[0]}, running_total => $total };
})
- Pagination
-
# Page 3, size 20
->Skip(40)->Take(20)->ToArray()
- Unique values
-
->Select(sub { $_[0]{category} })
->Distinct()
->ToArray()
- Conditional aggregation
-
# A query can be consumed only once, so materialize the records first
my @records = $query->ToArray();
my $success_avg = LTSV::LINQ->From(\@records)
->Where(status => '200')
->Average(sub { $_[0]{response_time} });
my $error_avg = LTSV::LINQ->From(\@records)
->Where(sub { $_[0]{status} >= 400 })
->Average(sub { $_[0]{response_time} });
LIMITATIONS AND KNOWN ISSUES
Current Limitations
Iterator Consumption
Query objects can only be consumed once. The iterator is exhausted after terminal operations.
Workaround: Create new query object or save ToArray() result.
No Parallel Execution
All operations execute sequentially in a single thread.
No Index Support
All filtering requires full scan. No index optimization.
GroupBy Sorts Keys
GroupBy returns groups in sorted key order. Original order is not preserved.
Distinct Uses String Keys
Distinct with custom comparer uses stringified keys. May not work correctly for complex objects.
Not Implemented
The following LINQ methods are NOT implemented in this version:
Join, GroupJoin - Relational operations
Zip - Combine two sequences
Union, Intersect, Except - Additional set operations
OfType - Type filtering
Cast - Type conversion
DefaultIfEmpty - Default value for empty sequence
SequenceEqual - Sequence comparison
Concat - Sequence concatenation
Aggregate - Custom aggregation with seed
LongCount - 64-bit count
These may be added in future versions if there is demand.
BUGS
Please report any bugs or feature requests to:
Email:
ina@cpan.org
SUPPORT
Documentation
Full documentation is available via:
perldoc LTSV::LINQ
CPAN
https://metacpan.org/pod/LTSV::LINQ
SEE ALSO
LTSV specification
http://ltsv.org/
Microsoft LINQ documentation
https://learn.microsoft.com/en-us/dotnet/csharp/linq/
AUTHOR
INABA Hitoshi <ina@cpan.org>
Contributors
Contributions are welcome! Please submit pull requests on GitHub.
ACKNOWLEDGEMENTS
LINQ Technology
This module is inspired by LINQ (Language Integrated Query), which was developed by Microsoft Corporation for the .NET Framework.
LINQ(R) is a registered trademark of Microsoft Corporation.
We are grateful to Microsoft for pioneering the LINQ technology and making it a widely recognized programming pattern. The elegance and power of LINQ has influenced query interfaces across many programming languages, and this module brings that same capability to LTSV data processing in Perl.
This module is not affiliated with, endorsed by, or sponsored by Microsoft Corporation.
References
This module was inspired by:
Microsoft LINQ (Language Integrated Query)
LTSV specification
COPYRIGHT AND LICENSE
Copyright (c) 2026 INABA Hitoshi
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
License Details
This module is released under the same license as Perl itself:
Artistic License 1.0
GNU General Public License version 1 or later
You may choose either license.
DISCLAIMER OF WARRANTY
BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENSE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.