NAME
POE::Filter::Line - serialize and parse terminated records (lines)
SYNOPSIS
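  #!perl

  use warnings;
  use strict;

  # Load POE along with the wheel and filter used below.
  use POE qw(Wheel::FollowTail Filter::Line);

  POE::Session->create(
    inline_states => {
      # Start tailing the log, parsing it into lines.
      _start => sub {
        $_[HEAP]{tailor} = POE::Wheel::FollowTail->new(
          Filename   => "/var/log/system.log",
          InputEvent => "got_log_line",
          Filter     => POE::Filter::Line->new(),
        );
      },
      # Each parsed line arrives in ARG0 without its terminator.
      got_log_line => sub {
        print "Log: $_[ARG0]\n";
      },
    },
  );

  POE::Kernel->run();
  exit;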
DESCRIPTION
POE::Filter::Line parses stream data into terminated records. The default parser interprets newlines as the record terminator, and the default serializer appends network newlines (CR/LF, or "\x0D\x0A") to outbound records.
Record terminators are removed from the data POE::Filter::Line returns.
POE::Filter::Line supports a number of other ways to parse lines. Constructor parameters may specify literal newlines, regular expressions, or that the filter should detect newlines on its own.
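As a rough illustration of the default behavior described above, the following sketch drives the filter directly through get() and put(), outside of any wheel. The sample strings are invented for the example.

```perl
use strict;
use warnings;
use POE::Filter::Line;

# Default filter: newlines terminate input records, CR/LF is appended
# to output records.
my $filter = POE::Filter::Line->new();

# get() takes an array reference of stream chunks and returns an array
# reference of complete records with their terminators removed.
my $records = $filter->get([ "first line\nsecond ", "line\npartial" ]);
# @$records is ("first line", "second line"); "partial" stays buffered
# until a terminator arrives.

# put() takes an array reference of records and returns an array
# reference of serialized strings.
my $chunks = $filter->put([ "outbound" ]);
# $chunks->[0] is "outbound\x0D\x0A"
```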
PUBLIC FILTER METHODS
POE::Filter::Line's new() method accepts parameters that control how records are parsed and serialized.
new
new() accepts a list of named parameters.
In all cases, the data interpreted as the record terminator is stripped from the data POE::Filter::Line returns.
InputLiteral may be used to parse records that are terminated by some literal string. For example, POE::Filter::Line may be used to parse and emit C-style lines, which are terminated with an ASCII NUL:
  my $c_line_filter = POE::Filter::Line->new(
    InputLiteral  => chr(0),
    OutputLiteral => chr(0),
  );
OutputLiteral allows a filter to put() records with a different record terminator than it parses. This can be useful in applications that must translate record terminators.
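A minimal sketch of terminator translation, assuming a stream that arrives with CR/LF endings and must leave with bare LF endings. The variable names and sample data are illustrative only.

```perl
use strict;
use warnings;
use POE::Filter::Line;

# Parse CR/LF-terminated records, emit LF-terminated records.
my $translator = POE::Filter::Line->new(
  InputLiteral  => "\x0D\x0A",
  OutputLiteral => "\x0A",
);

# Parse the inbound data, then serialize the same records again.
my $records    = $translator->get([ "one\x0D\x0Atwo\x0D\x0A" ]);
my $translated = join '', @{ $translator->put($records) };
# $translated is "one\x0Atwo\x0A"
```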
Literal is a shorthand for the common case where the input and output literals are identical. The previous example may be written as:
  my $c_line_filter = POE::Filter::Line->new(
    Literal => chr(0),
  );
An application can also allow POE::Filter::Line to figure out which newline to use. This is done by specifying InputLiteral to be undef:
  my $whichever_line_filter = POE::Filter::Line->new(
    InputLiteral  => undef,
    OutputLiteral => "\n",
  );
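The sketch below shows what autodetection might look like in practice, assuming the first terminator the filter sees is a complete CR/LF pair. The input data is invented for the example.

```perl
use strict;
use warnings;
use POE::Filter::Line;

# Let the filter work out the input newline from the stream itself.
my $filter = POE::Filter::Line->new(
  InputLiteral  => undef,
  OutputLiteral => "\n",
);

# The first terminator seen here is CR/LF, so later records are split
# on CR/LF as well.
my $records = $filter->get([ "one\x0D\x0Atwo\x0D\x0Athree" ]);
# @$records is ("one", "two"); "three" waits for its terminator.
```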
InputRegexp may be used in place of InputLiteral to recognize line terminators based on a regular expression. In this example, input is terminated by two or more consecutive newlines. On output, the paragraph separator is "---" on a line by itself.
  my $paragraph_filter = POE::Filter::Line->new(
    InputRegexp   => "([\x0D\x0A]{2,})",
    OutputLiteral => "\n---\n",
  );
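To make the paragraph behavior concrete, here is a small sketch that feeds such a filter a few paragraphs. It passes the pattern as a compiled qr// object, which is an assumption on my part about how callers would usually supply it; the sample text is made up.

```perl
use strict;
use warnings;
use POE::Filter::Line;

# Two or more consecutive CR or LF characters end a record.
my $paragraph_filter = POE::Filter::Line->new(
  InputRegexp   => qr/[\x0D\x0A]{2,}/,
  OutputLiteral => "\n---\n",
);

my $paragraphs = $paragraph_filter->get(
  [ "para one\n\npara two\n\n\nunfinished" ]
);
# @$paragraphs is ("para one", "para two"); "unfinished" stays buffered.

my $output = join '', @{ $paragraph_filter->put($paragraphs) };
# $output is "para one\n---\npara two\n---\n"
```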
MaxBuffer sets the maximum amount of data that the filter will hold onto while trying to find a line ending. Defaults to 512 MB.
MaxLength sets the maximum length of a line. Defaults to 64 MB.
If either the MaxLength or MaxBuffer constraint is exceeded, POE::Filter::Line will throw an exception.
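Since the limits are enforced by throwing, callers that feed the filter by hand may want to trap the exception. The sketch below assumes an artificially small MaxLength of 8 bytes to trigger the condition; the exact wording of the exception is not specified here, so the code only checks that one was raised.

```perl
use strict;
use warnings;
use POE::Filter::Line;

# An unrealistically small limit, chosen only to force the error.
my $filter = POE::Filter::Line->new( MaxLength => 8 );

# This record is 16 bytes long, twice the configured limit.
my $records = eval { $filter->get([ "0123456789abcdef\n" ]) };
my $error   = $@;

if ($error) {
  warn "line rejected: $error";
}
```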
PUBLIC FILTER METHODS
POE::Filter::Line has no additional public methods.
SUBCLASSING
POE::Filter::Line exports the FIRST_UNUSED constant. This points to the first unused element in the $self array reference. Subclasses should store their own data beginning here, and they should export their own FIRST_UNUSED constants to help future subclassers.
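One way a subclass might follow this convention is sketched below. My::CountingLine, LINE_COUNT, and line_count() are hypothetical names introduced for illustration, and the @EXPORT_OK line assumes Exporter is available through the parent class.

```perl
use strict;
use warnings;
use POE::Filter::Line;

# My::CountingLine is a hypothetical subclass that counts parsed lines.
package My::CountingLine;
our @ISA = ('POE::Filter::Line');

# Claim the first free slot in $self, then publish the next free slot
# for any further subclass.
use constant LINE_COUNT   => POE::Filter::Line::FIRST_UNUSED();
use constant FIRST_UNUSED => LINE_COUNT + 1;
our @EXPORT_OK = qw(FIRST_UNUSED);  # assumes Exporter via the parent

sub new {
  my $class = shift;
  my $self  = $class->SUPER::new(@_);
  $self->[LINE_COUNT] = 0;
  return $self;
}

# Count every record the parent parser hands back.
sub get_one {
  my $self    = shift;
  my $records = $self->SUPER::get_one();
  $self->[LINE_COUNT] += @$records;
  return $records;
}

sub line_count { return $_[0][LINE_COUNT] }

package main;

my $filter = My::CountingLine->new( Literal => "\n" );
$filter->get_one_start([ "a\nb\nc" ]);
while ( @{ $filter->get_one() } ) { }  # drain the complete records
# $filter->line_count() is now 2; "c" is still buffered.
```

Storing the counter at FIRST_UNUSED rather than at a hard-coded index keeps the subclass from colliding with slots the parent already uses.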
SEE ALSO
Please see POE::Filter for documentation regarding the base interface.
The SEE ALSO section in POE contains a table of contents covering the entire POE distribution.
BUGS
The default input newline parser is a regexp that has an unfortunate race condition. First the regular expression:
  /(\x0D\x0A?|\x0A\x0D?)/
While it quickly recognizes most forms of newline, it can sometimes detect an extra blank line. This happens when a two-byte newline sequence is split across two reads. Consider this situation:
  some stream dataCR
  LFother stream data
The regular expression will see the first CR without its corresponding LF. The filter will properly return "some stream data" as a line. When the next packet arrives, the leading "LF" will be treated as the terminator for a 0-byte line. The filter will faithfully return this empty line.
It is advised to specify literal newlines or use the autodetect feature in applications where blank lines are significant.
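The situation above can be reproduced by hand. This sketch feeds the same two reads to a default filter and to one configured with a literal CR/LF terminator; the variable names are illustrative.

```perl
use strict;
use warnings;
use POE::Filter::Line;

my $default = POE::Filter::Line->new();
my $literal = POE::Filter::Line->new( InputLiteral => "\x0D\x0A" );

# A CR/LF pair split across two reads.
my @reads = ( "some stream data\x0D", "\x0Aother stream data\x0D\x0A" );

my (@from_default, @from_literal);
for my $read (@reads) {
  push @from_default, @{ $default->get([ $read ]) };
  push @from_literal, @{ $literal->get([ $read ]) };
}
# @from_default is ("some stream data", "", "other stream data")
# @from_literal is ("some stream data", "other stream data")
```

The literal filter waits for the complete two-byte terminator, so it never reports the spurious empty line.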
AUTHORS & COPYRIGHTS
Please see POE for more information about authors and contributors.