NAME
File::Raw - File operations using direct system calls
SYNOPSIS
use File::Raw qw(import);
# Slurp entire file
my $content = file_slurp('/path/to/file');
# Write to file
file_spew('/path/to/file', $content);
# Append to file
file_append('/path/to/file', "more data\n");
# Get all lines as array
my $lines = file_lines('/path/to/file');
# Process lines efficiently with callback (line via $_)
file_each_line('/path/to/file', sub {
print "Line: $_\n";
});
# Memory-mapped file access
my $mmap = file_mmap_open('/path/to/file');
my $data = $mmap->data; # Zero-copy access
$mmap->close;
# Line iterator (memory efficient)
my $iter = file_lines_iter('/path/to/file');
while (!$iter->eof) {
my $line = $iter->next;
# process line
}
$iter->close;
# File stat operations
my $size = file_size('/path/to/file');
my $mtime = file_mtime('/path/to/file');
my $exists = file_exists('/path/to/file');
# Type checks
file_is_file('/path/to/file');
file_is_dir('/path/to/dir');
file_is_readable('/path/to/file');
file_is_writable('/path/to/file');
DESCRIPTION
File::Raw provides file operations using direct system calls, bypassing PerlIO overhead. It includes functions for reading/writing files, iterating lines efficiently, memory-mapped access, and file metadata operations. The module also supports a plugin system for custom read/write/transform operations.
Performance
The module uses:
Direct read(2)/write(2) syscalls
Pre-allocated buffers based on file size
Memory-mapped file access for zero-copy reads
Efficient line iteration without loading entire file
MULTICALL optimization for callback-based functions
FUNCTIONS
All functions are available with a file_ prefix when imported, e.g. file_slurp, file_spew, etc.
use File::Raw qw(slurp spew);
file_spew('/path/to/file', "data"); # Write data
my $content = file_slurp('/path/to/file'); # Read data
slurp
my $content = File::Raw::slurp($path);
Read entire file into a scalar. Returns undef on error. Pre-allocates the buffer based on file size for optimal performance.
slurp_raw
my $content = File::Raw::slurp_raw($path);
Same as slurp, explicit binary mode.
spew
my $ok = File::Raw::spew($path, $data);
Write data to file (creates or truncates). Returns true on success.
append
my $ok = File::Raw::append($path, $data);
Append data to file. Returns true on success.
lines
my $lines = File::Raw::lines($path);
Returns arrayref of all lines (without newlines).
each_line
File::Raw::each_line($path, sub {
print "Line: $_\n"; # line available via $_
});
Process each line with a callback. Memory efficient - doesn't load entire file into memory.
lines_iter
my $iter = File::Raw::lines_iter($path);
while (!$iter->eof) {
my $line = $iter->next;
}
$iter->close;
Returns a line iterator object. Without a plugin tail, the iterator streams bytes lazily and is memory-efficient. With a plugin tail (lines_iter($path, plugin => 'csv', ...)) the iterator is eager: the file is slurped and parsed into an AoA at construction time, and next walks the array; the header => 1 and header => [names] options are honoured. For memory-bounded streaming through a plugin use each_line instead.
Note: For maximum performance, prefer each_line() which uses MULTICALL optimization and is significantly faster. Use lines_iter() when you need iterator control (e.g., early exit, multiple iterators).
mmap_open
my $mmap = File::Raw::mmap_open($path);
my $mmap = File::Raw::mmap_open($path, 1); # writable
Memory-map a file. Returns a File::Raw::mmap object.
File::Raw::mmap methods
- data() - Returns the mapped content as a scalar (zero-copy)
- sync() - Flush changes to disk (for writable maps)
- close() - Unmap the file
size
my $bytes = File::Raw::size($path);
Returns file size in bytes, or -1 on error.
mtime
my $epoch = File::Raw::mtime($path);
Returns modification time as epoch seconds, or -1 on error.
exists
if (File::Raw::exists($path)) { ... }
Returns true if path exists.
is_file
if (File::Raw::is_file($path)) { ... }
Returns true if path is a regular file.
is_dir
if (File::Raw::is_dir($path)) { ... }
Returns true if path is a directory.
is_readable
if (File::Raw::is_readable($path)) { ... }
Returns true if path is readable.
is_writable
if (File::Raw::is_writable($path)) { ... }
Returns true if path is writable.
stat
my $st = File::Raw::stat($path);
Returns a hashref with all file attributes in one syscall. This is much more efficient than calling multiple individual functions.
my $st = File::Raw::stat($path);
# $st = {
# size => 12345,
# mtime => 1234567890,
# atime => 1234567890,
# ctime => 1234567890,
# mode => 0644, # Permission bits only
# is_file => 1,
# is_dir => '',
# dev => 16777233,
# ino => 12345,
# nlink => 1,
# uid => 501,
# gid => 20,
# }
Returns undef if stat fails.
File::Raw caches the last stat result for performance. When you call multiple stat-like functions on the same file (size, mtime, is_readable, etc.), only the first call hits the filesystem.
clear_stat_cache
File::Raw::clear_stat_cache(); # Clear entire cache
File::Raw::clear_stat_cache($path); # Clear cache for specific path
Clear the internal stat cache. Use this when an external process may have modified a file, or to force a fresh stat on the next call.
The cache is automatically invalidated when you use File::Raw functions that modify files (spew, append, unlink, touch, chmod, etc.), but external modifications require manual cache clearing.
my $size1 = File::Raw::size($file); # Cached stat
# External process modifies $file...
File::Raw::clear_stat_cache($file); # Clear cache
my $size2 = File::Raw::size($file); # Fresh stat
copy
my $ok = File::Raw::copy($src, $dst);
Copy a file. Uses native copy functions (copyfile on macOS, sendfile on Linux) for optimal performance. Returns true on success.
move
my $ok = File::Raw::move($src, $dst);
Move/rename a file. Uses rename() for same-filesystem moves, falls back to copy+unlink for cross-device moves. Returns true on success.
unlink
my $ok = File::Raw::unlink($path);
Delete a file. Returns true on success.
touch
my $ok = File::Raw::touch($path);
Create an empty file or update timestamps. Returns true on success.
mkdir
my $ok = File::Raw::mkdir($path);
my $ok = File::Raw::mkdir($path, $mode);
Create a directory. Default mode is 0755. Returns true on success.
rmdir
my $ok = File::Raw::rmdir($path);
Remove an empty directory. Returns true on success.
readdir
my $entries = File::Raw::readdir($path);
Returns arrayref of directory entries (excludes . and ..).
basename
my $name = File::Raw::basename($path);
Returns the filename portion of a path.
dirname
my $dir = File::Raw::dirname($path);
Returns the directory portion of a path.
extname
my $ext = File::Raw::extname($path);
Returns the file extension including the dot (e.g., ".txt").
join
my $path = File::Raw::join($part1, $part2, ...);
Join path components with the appropriate separator. Handles leading/trailing slashes correctly.
my $path = File::Raw::join('/usr', 'local', 'bin');
# Returns: /usr/local/bin
atime
my $epoch = File::Raw::atime($path);
Returns access time as epoch seconds, or -1 on error.
ctime
my $epoch = File::Raw::ctime($path);
Returns inode change time as epoch seconds, or -1 on error.
mode
my $mode = File::Raw::mode($path);
Returns the file permission bits (e.g., 0644), or -1 on error.
is_link
if (File::Raw::is_link($path)) { ... }
Returns true if path is a symbolic link.
is_executable
if (File::Raw::is_executable($path)) { ... }
Returns true if path is executable.
chmod
my $ok = File::Raw::chmod($path, $mode);
Change file permissions. Returns true on success.
File::Raw::chmod($path, 0755);
head
my $lines = File::Raw::head($path); # First 10 lines
my $lines = File::Raw::head($path, 20); # First 20 lines
Returns arrayref of first N lines (default 10).
tail
my $lines = File::Raw::tail($path); # Last 10 lines
my $lines = File::Raw::tail($path, 20); # Last 20 lines
Returns arrayref of last N lines (default 10).
range_lines
my $lines = File::Raw::range_lines($path, $from, $count);
Returns arrayref of $count lines starting at line $from. 1-based: range_lines($p, 5, 3) returns lines 5, 6, 7. range_lines($p, 1, 10) is equivalent to head($p, 10).
If $from is past EOF, or $count <= 0, or $from < 1, returns an empty arrayref. If fewer than $count lines remain after $from, returns whatever is available (no error).
Accepts the standard plugin tail; with plugin => 'csv' (or any plugin returning AoA) the range is applied to the parsed records:
# Rows 100..149 of a CSV
my $page = File::Raw::range_lines($p, 100, 50,
plugin => 'csv', header => 1);
Same eager trade-off as lines_iter with a plugin: the file is slurped and parsed in full before the slice is taken. For memory-bounded streaming through a plugin use each_line with a counter and die to bail.
atomic_spew
my $ok = File::Raw::atomic_spew($path, $data);
Write data to a temporary file then atomically rename. This ensures the file is never in a partial state. Returns true on success.
grep_lines
my $lines = File::Raw::grep_lines($path, \&predicate);
my $lines = File::Raw::grep_lines($path, 'not_blank');
Filter lines matching a predicate. The predicate can be a coderef or a registered predicate name.
Built-in predicates: blank, not_blank, empty, not_empty, comment, not_comment
# Using coderef
my $lines = File::Raw::grep_lines($path, sub { /pattern/ });
# Using built-in predicate
my $lines = File::Raw::grep_lines($path, 'not_blank');
count_lines
my $count = File::Raw::count_lines($path);
my $count = File::Raw::count_lines($path, \&predicate);
my $count = File::Raw::count_lines($path, 'not_blank');
Count lines in a file. Optionally filter by predicate.
find_line
my $line = File::Raw::find_line($path, \&predicate);
my $line = File::Raw::find_line($path, 'not_blank');
Find the first line matching a predicate. Returns undef if not found.
map_lines
my $results = File::Raw::map_lines($path, \&transform);
Transform each line with a callback, returns arrayref of results.
my $lengths = File::Raw::map_lines($path, sub { length($_) });
register_predicate
File::Raw::register_predicate($name, \&predicate);
Register a custom named predicate for use with grep_lines / count_lines / find_line. The coderef receives the line in $_.
File::Raw::register_predicate('has_error', sub { /ERROR/ });
my $errors = File::Raw::grep_lines($path, 'has_error');
list_predicates
my $names = File::Raw::list_predicates();
Returns arrayref of registered predicate names (built-ins plus any custom ones).
PLUGINS
Most read / write / iteration functions accept a plugin tail:
File::Raw::slurp($path, plugin => 'csv', sep => ';', header => 1);
File::Raw::spew ($path, $rows, plugin => 'csv');
File::Raw::each_line($path, sub { ... }, plugin => 'csv');
The tail is parsed as key => value pairs; the plugin key is mandatory whenever options are supplied. The named plugin must be registered via "register_plugin" (Perl) or file_register_plugin() (C, see "XS API") before the call.
The following functions are plugin-aware:
Read:
slurp,lines,head,tail,range_linesWrite:
spew,append,atomic_spewStreaming:
each_lineIterator:
lines_iterRecord-derived:
grep_lines,count_lines,find_line,map_lines
slurp_raw and the stat / dir / path families are intentionally plugin-free.
lines_iter with a plugin tail is eager (it slurps the file once into an AoA at construction time and the iterator walks that array). The iterator interface is preserved so callers can still store the handle, call next/eof/close, etc., but it is not memory-bounded: for true streaming over huge files use each_line with the same plugin tail.
Plugin chains
The plugin value can be an arrayref of plugin names instead of a single name. The chain describes the file's encoding stack from outermost wrapper to innermost format; same spelling for both directions.
# data.csv.gz: gzip wraps csv. Slurp unwraps left-to-right -
# gunzip first, then parse csv - and returns an AoA.
my $rows = File::Raw::slurp($path,
plugin => ['gzip', 'csv']);
# spew applies right-to-left: csv-encode the AoA into bytes,
# then gzip the bytes, then write the result.
File::Raw::spew($path, $rows,
plugin => ['gzip', 'csv']);
The single-plugin scalar form (plugin => 'csv') keeps its current semantics exactly; chains are purely additive.
Per-plugin options
When the chain has more than one plugin, give each one its own sub-hash. Keys outside any sub-hash are shared across the whole chain (visible to every plugin); per-plugin keys win on conflict.
File::Raw::slurp($path,
plugin => ['gzip', 'csv'],
gzip => { level => 9 }, # only gzip sees this
csv => { sep => ';' }, # only csv sees this
strict => 1, # both gzip and csv see this
);
The single-plugin scalar form takes a flat options bag (top-level keys go straight to the lone plugin) - no sub-hash required.
Type contract
File::Raw doesn't statically enforce the chain's type contract; each plugin sees its predecessor's return value verbatim. The convention is:
For READ: every plugin except the last must return bytes. The last plugin can return any shape (bytes, AoA, AoH, ...).
For WRITE: every plugin except the first must accept bytes; the first sees the user's payload (which may itself be structured).
In practice that means structured-output plugins (csv, json, yaml) belong last in a READ chain and first in a WRITE chain. Byte-transform plugins (gzip, base64, encoding) are chain-friendly anywhere.
Phase coverage
Chains are supported for READ and WRITE only. The record-derived helpers (grep_lines, count_lines, find_line, map_lines) get chain support transparently because they slurp + transform via READ before iterating records.
each_line (the true streaming path) rejects arrayref plugin values: composing two streams needs a record-to-chunk adapter that's its own design problem. Pass a single plugin name there. record phase is also single-plugin only - chaining record functions would require records to keep the same shape across links.
Plugin-author notes
Existing plugins keep working without recompilation: FilePlugin and FilePluginContext are unchanged. A plugin's read/write callback is invoked the same way whether it's standalone or part of a chain; the dispatcher builds a per-iteration ctx->options HV that contains the shared keys overlaid with the plugin's own sub-hash (if any).
register_plugin
File::Raw::register_plugin($name, \%phases);
File::Raw::register_plugin($name, \%phases, $override);
Register a plugin that will be invoked when callers pass plugin => $name. %phases is a hashref of coderefs keyed by phase name. A plugin may implement any subset of phases; absent ones cause a clear error if the user requests them.
File::Raw::register_plugin('csv', {
read => sub { my ($path, $bytes, $opts) = @_; ... }, # bytes -> AoA
write => sub { my ($path, $rows, $opts) = @_; ... }, # rows -> bytes
record => sub { my ($path, $record, $opts) = @_; ... }, # transform/filter
});
The stream phase is intentionally not exposed from Perl - per-chunk call_sv overhead defeats the purpose of streaming. Plugins that need record-by-record callbacks should implement record; File::Raw drives the iteration itself. Streaming plugins must be written in C.
Re-registering a name without $override croaks; pass a true $override to replace.
unregister_plugin
File::Raw::unregister_plugin($name);
Remove a previously-registered plugin.
list_plugins
my $names = File::Raw::list_plugins();
Returns arrayref of currently registered plugin names. The built-in 'predicate' plugin is always present.
The built-in 'predicate' plugin
Boot-time-registered C plugin that owns the eight built-in line predicates (blank/is_blank, not_blank/is_not_blank, empty/is_empty, not_empty/is_not_empty, comment/is_comment, not_comment/is_not_comment) plus any predicate added via "register_predicate". The legacy 2-arg form
File::Raw::grep_lines($path, 'is_blank');
is sugar for going through this plugin.
IMPORT STYLE
use File::Raw qw(:all); # Import all functions as file_*
use File::Raw qw(slurp spew lines); # Import specific functions
use File::Raw qw(import); # Same as :all (backwards compat)
When imported, the functions are installed with `file_` prefix and use custom ops for maximum performance (eliminating function call overhead).
use File::Raw qw(slurp spew);
my $content = file_slurp($path);
file_spew($path, $data);
Available imports: slurp, slurp_raw, spew, append, atomic_spew, lines, exists, size, mtime, atime, ctime, mode, is_file, is_dir, is_link, is_readable, is_writable, is_executable, unlink, mkdir, rmdir, touch, copy, move, chmod, readdir, basename, dirname, extname, clear_stat_cache.
PERFORMANCE NOTES
Platform Optimizations
macOS: Uses copyfile() for native file copying
Linux: Uses sendfile() for zero-copy file transfer
Linux/BSD: Uses posix_fadvise() to hint sequential reads
When to use File::Raw::stat
If you need multiple attributes from a file (size, mtime, is_file, etc.), use File::Raw::stat() instead of calling individual functions:
# SLOW: 5 syscalls
my $size = File::Raw::size($path);
my $mtime = File::Raw::mtime($path);
my $is_file = File::Raw::is_file($path);
# FAST: 1 syscall
my $st = File::Raw::stat($path);
my ($size, $mtime, $is_file) = @{$st}{qw(size mtime is_file)};
XS API
File::Raw exposes a plugin C API via include/file_plugin.h. Downstream XS modules can register C-level plugins that File::Raw's read / write / streaming dispatch routes calls into - no per-record call_sv overhead. The shared object is loaded with RTLD_GLOBAL so symbols resolve at load time without an explicit link step on Linux/macOS.
Types
- FilePluginPhase
-
FILE_PLUGIN_PHASE_READ /* whole-file slurp transform */ FILE_PLUGIN_PHASE_WRITE /* whole-file spew/append transform */ FILE_PLUGIN_PHASE_RECORD /* per-record dispatch */ FILE_PLUGIN_PHASE_STREAM /* chunked feed for streaming */ - FilePluginContext
-
Per-call dispatch context (lifetime: single dispatch call).
typedef struct FilePluginContext { const char *path; /* file path */ SV *data; /* read: bytes; write: payload */ SV *callback; /* per-record cb (stream phase) */ HV *options; /* per-call opts; mortal; never NULL*/ int phase; int cancel; /* set non-zero to cancel op */ void *plugin_state; /* opaque, copied from plugin->state*/ } FilePluginContext; - FilePlugin
-
Registration block; the caller owns the storage (typically a file-scope static) and must keep it alive for as long as the plugin is registered.
typedef struct FilePlugin { const char *name; file_plugin_read_fn read_fn; /* NULL if not implemented */ file_plugin_write_fn write_fn; file_plugin_record_fn record_fn; file_plugin_stream_fn stream_fn; void *state; } FilePlugin;Phase signatures:
typedef SV* (*file_plugin_read_fn) (pTHX_ FilePluginContext *ctx); typedef SV* (*file_plugin_write_fn) (pTHX_ FilePluginContext *ctx); typedef SV* (*file_plugin_record_fn) (pTHX_ FilePluginContext *ctx, SV *record); typedef int (*file_plugin_stream_fn) (pTHX_ FilePluginContext *ctx, const char *chunk, size_t len, int eof);
Functions
- file_register_plugin
-
int file_register_plugin(pTHX_ const FilePlugin *plugin);Returns 1 on success, 0 if a plugin with the same name is already registered (use
file_unregister_pluginfirst), -1 on invalid input (NULL plugin, NULL/empty name). Call during module initialisation only (not thread-safe). - file_unregister_plugin
-
int file_unregister_plugin(pTHX_ const char *name);Remove a plugin by name. Returns 1 if found and removed.
- file_lookup_plugin
-
const FilePlugin *file_lookup_plugin(pTHX_ const char *name);Look up a plugin by name. Returns the registered struct or NULL.
- file_plugin_dispatch_read / file_plugin_dispatch_write / file_plugin_dispatch_stream / file_plugin_dispatch_record
-
SV* file_plugin_dispatch_read (pTHX_ HV *opts, const char *path, SV *bytes); SV* file_plugin_dispatch_write (pTHX_ HV *opts, const char *path, SV *payload); SV* file_plugin_dispatch_stream(pTHX_ HV *opts, const char *path, SV *cb); SV* file_plugin_dispatch_record(pTHX_ HV *opts, const char *path, SV *record);Each helper extracts the
pluginkey fromopts, looks up the plugin (croaks if unknown), confirms the requested phase function pointer is non-NULL (croaks otherwise), builds aFilePluginContexton the stack, and invokes the phase function. These are the functions File::Raw's own XSUBs call - downstream modules normally don't need to call them directly.
Example (downstream XS module)
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <file_plugin.h>
static SV* upper_read(pTHX_ FilePluginContext *ctx) {
STRLEN len;
char *src = SvPV(ctx->data, len);
SV *out = newSVpvn(src, len);
char *dst = SvPVX(out);
STRLEN i;
for (i = 0; i < len; i++)
if (dst[i] >= 'a' && dst[i] <= 'z')
dst[i] -= 32;
return out;
}
static FilePlugin upper_plugin = {
"upper",
upper_read, NULL, NULL, NULL,
NULL
};
MODULE = MyModule PACKAGE = MyModule
BOOT:
file_register_plugin(aTHX_ &upper_plugin);
After use MyModule, callers can write File::Raw::slurp($path, plugin => 'upper') and File::Raw routes the slurped bytes through upper_read.
AUTHOR
LNATION <email@lnation.org>
BUGS
Please report any bugs or feature requests to bug-file-fast at rt.cpan.org, or through the web interface at https://rt.cpan.org/NoAuth/ReportBug.html?Queue=File-Fast. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc File::Raw
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
Search CPAN
LICENSE AND COPYRIGHT
This software is Copyright (c) 2026 by LNATION <email@lnation.org>.
This is free software, licensed under:
The Artistic License 2.0 (GPL Compatible)