NAME
Data::Beacon - BEACON format validating parser and serializer
SYNOPSIS
use Data::Beacon;
$beacon = Data::Beacon->new( $beaconfile );
$beacon = beacon( $beaconfile ); # equivalent
$beacon = beacon( { FOO => "bar" } ); # empty Beacon with meta fields
$beacon->meta(); # get all meta fields
$beacon->meta( 'DESCRIPTION' => 'my best links' ); # set meta fields
$d = $beacon->meta( 'DESCRIPTION' ); # get meta field
$beacon->meta( 'DESCRIPTION' => '' ); # unset meta field
print $beacon->metafields();
$beacon->parse(); # proceed parsing links
$beacon->parse( error => 'print' ); # print errors to STDERR
$beacon->parse( error => \&error_handler );
$beacon->parse( $beaconfile );
$beacon->parse( \$beaconstring );
$beacon->parse( sub { return $nextline } );
$beacon->count(); # number of parsed links
$beacon->errorcount(); # number of parsing errors
DESCRIPTION
This package implements a parser and serializer for BEACON format with dedicated error handling. A Beacon, as implemented by Data::Beacon, is a set of links together with some meta fields that describe it. Each link consists of four values: source (also referred to as id), label, description, and target. Source and target are mandatory URIs; label and description are strings that default to the empty string.
BEACON format is the serialization format for Beacons. It defines a very condensed syntax to express links without having to deal much with technical specifications.
See http://meta.wikimedia.org/wiki/BEACON for a more detailed description.
SERIALIZING
To serialize only BEACON meta fields, create a new Beacon object, and set its meta fields (passed to the constructor, or with "meta"). You can then get the meta fields in BEACON format with "metafields":
my $beacon = beacon( { PREFIX => ..., TARGET => ... } );
print $beacon->metafields;
The easiest way to serialize links in BEACON format is to set your Beacon object's link handler to print, so each link is directly printed to STDOUT. By also setting the error handler to print, errors are printed to STDERR.
my $beacon = beacon( \%metafields, error => 'print', link => 'print' );
print $beacon->metafields();
while ( ... ) {
$beacon->appendlink( $source, $label, $description, $target );
}
Alternatively you can use the function "plainbeaconlink". In this case you should validate links before printing:
if ( $beacon->appendlink( $source, $label, $description, $target ) ) {
print plainbeaconlink( $beacon->link ) . "\n";
}
PARSING
You can parse BEACON format either with an iterator (pull parsing):
my $beacon = beacon( $file );
while ( $beacon->nextlink ) {
my ($source, $label, $description, $target, $sourceuri, $targeturi) = $beacon->link;
...
}
Or by push parsing with handler callbacks:
my $beacon = beacon( $file );
$beacon->parse( 'link' => \&link_handler );
$errors = $beacon->errorcount;
Instead of a filename, you can also provide a scalar reference to parse from a string. The meta fields are parsed immediately:
my $beacon = beacon( $file );
print $beacon->metafields . "\n";
my $errors = $beacon->errorcount;
To quickly parse a BEACON file:
use Data::Beacon;
beacon($file)->parse();
QUERYING
METHODS
new ( [ $from ] { handler => coderef } | $metafields )
Create a new Beacon object, optionally from a given file. If you specify a source via the $from argument or as parameter from => $from, it will be opened for parsing and all meta fields will immediately be read from it. Otherwise you get an empty, but initialized Beacon object. See the parse method for more details about possible handlers as parameters.
meta ( [ $key [ => $value [ ... ] ] ] )
Get and/or set one or more meta fields. Returns a hash (no arguments), or a string or undef (one argument), or croaks on invalid arguments. A meta field can be unset by setting its value to the empty string. The FORMAT field cannot be unset. This method may also croak if you try to set a known field, such as FORMAT, PREFIX, FEED, EXAMPLES, REVISIT, or TIMESTAMP, to an invalid value. Such an error will not change the error counter of this object or modify lasterror.
count
If parsing has been started, returns the number of links successfully read so far (or zero). If only the meta fields have been parsed, this returns the value of the COUNT meta field. In contrast to meta('count'), this method always returns a number. Note that all valid links that could be parsed are included, no matter whether they were processed by a link handler or not.
line
Returns the current line number or zero.
lasterror
Returns the last parsing error message (if any). Errors triggered by directly calling meta are not included. In list context returns a list of error message, line number, and current line content.
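The three values returned in list context can be combined into a one-line report. The following is only a minimal sketch: the helper name format_beacon_error is hypothetical and not part of Data::Beacon, and the sample messages in its use are illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Beacon): formats the list that
# lasterror returns in list context (message, line number, line content).
sub format_beacon_error {
    my ($message, $lineno, $content) = @_;
    return '' unless defined $message && length $message;

    my $report = $message;
    $report .= " at line $lineno" if $lineno;    # skip zero or undef
    $report .= ": $content" if defined $content && length $content;
    return $report;
}

# Intended use, assuming $beacon is a Data::Beacon object:
#   warn format_beacon_error( $beacon->lasterror ), "\n"
#       if $beacon->errorcount;
```

Keeping the formatting in a separate function means the same report can be produced after push parsing with parse or after each step of pull parsing with nextlink.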
errorcount
Returns the number of parsing errors or zero.
metafields
Return all meta fields, serialized and sorted, as a string. Although the order of fields is irrelevant, this implementation always returns the same fields in the same order. To get all meta fields as a hash, use the meta method.
parse ( [ $from ] { handler => coderef | option => $value } )
Parse all remaining links (push parsing). If provided a from parameter, this starts a new Beacon. That means the following three are equivalent:
$b = Data::Beacon->new( $from );
$b = Data::Beacon->new( from => $from );
$b = Data::Beacon->new;
$b->parse( $from );
If from is a scalar, it is used as the name of the file to parse from. Alternatively you can supply a string reference or a code reference.
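A code reference as input source can be sketched as a closure that hands out one line per call. This is a hedged example: the helper name line_source is hypothetical, the sample lines are illustrative, and it assumes that returning undef signals the end of input, which is not stated above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Beacon): wraps a list of lines in
# a closure that returns one line per call.
# Assumption: returning undef tells the parser that the input has ended.
sub line_source {
    my @lines = @_;
    return sub { return shift @lines };
}

# Illustrative input: one meta field followed by two links.
my $next = line_source(
    "#PREFIX: http://example.org/id/\n",
    "alpha|First label\n",
    "beta\n",
);

# Intended use, assuming $beacon is a Data::Beacon object:
#   $beacon->parse( $next );
```

The same pattern works for any source that can produce lines on demand, such as a database cursor or a network stream.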
The pre option can be used to set some meta fields before parsing starts. These fields are cached and reused every time you call parse.
If the mtime option is given, the TIMESTAMP meta value will be initialized as last modification time of the given file.
By default, all errors are silently ignored, unless you specify an error handler. The last error can be retrieved with the lasterror method and the number of errors with errorcount. Returns true only if errorcount is zero after parsing. Note that some errors may be less important.
Finally, the link handler can be a code reference to a method that is called for each link (that is, each line in the input that contains a valid link). The following arguments are passed to the handler:
$source - Link source as given in BEACON format. This may be abbreviated but not the empty string.
$label - Label as string. This may be the empty string.
$description - Description as string. This may be the empty string.
$target - Link target as given in BEACON format. This may be abbreviated or the empty string.
The number of successfully parsed links is returned by count.
Errors in the link handler and the input handler are caught, and produce an error that is given to the error handler.
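The handlers described above can be sketched as two plain subroutines that collect what they receive. This is a minimal sketch: the names link_handler, error_handler, @links, and @problems are illustrative, and the error handler only assumes that the error message is its first argument.

```perl
use strict;
use warnings;

my (@links, @problems);

# Link handler: receives the four values listed above, unexpanded.
sub link_handler {
    my ($source, $label, $description, $target) = @_;
    push @links, {
        source      => $source,
        label       => $label,
        description => $description,
        target      => $target,
    };
    return;
}

# Error handler. Assumption: the message is passed first; any further
# arguments are accepted but ignored here.
sub error_handler {
    my ($message, @rest) = @_;
    push @problems, $message;
    return;
}

# Intended use, assuming $file names a BEACON file:
#   my $beacon = beacon( $file );
#   $beacon->parse( link => \&link_handler, error => \&error_handler );
#   printf "%d links, %d errors\n", $beacon->count, $beacon->errorcount;
```

Collecting into arrays keeps the handlers free of side effects on the parser; since exceptions thrown inside a handler are reported as parsing errors, handlers that do real work should fail loudly rather than silently.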
nextlink
Read from the input stream until the next link has been parsed. Empty lines and invalid lines are skipped, but the error handler is called on invalid lines. This method can be used for pull parsing. Always returns either the link as list or an empty list if the end of input has been reached.
link
Returns the last valid link that has been read. The link is returned as a list of four values (source, label, description, target) without expansion. Use the "expanded" method to get the link with full URIs.
expanded
Returns the last valid link that has been read, in expanded form. The link is returned as a list of four values (source, label, description, target), possibly expanded by the meta fields PREFIX, TARGET/TARGETPREFIX.
expandlink ( $source, $label, $description, $target )
Expand a link, consisting of source (mandatory), and label, description, and target (all optional). Returns the expanded link as an array with four values, or an empty list. This method does not append the link to the Beacon object, nor does it call any handlers.
appendline( $line )
Append a line of BEACON format. This method parses the line, and calls the link handler or the error handler. In scalar context returns whether a link has been read (that can then be accessed with link). In list context, returns the parsed link as a list, or the empty list if the line could not be parsed.
appendlink ( $source [, $label [, $description [, $target ] ] ] )
Append a link. The link is validated and returned as list of four values. On error the error handler is called and an empty list is returned. On success the link handler is called.
FUNCTIONS
The following functions are exported by default.
beacon ( [ $from ] { handler => coderef } )
Shortcut for Data::Beacon->new.
plainbeaconlink ( $source, $label, $description, $target )
Serialize a link, consisting of source (mandatory), label, description, and target (all optional) as condensed string in BEACON format. This function does not check whether the arguments form a valid link or not. You can pass a simple link, as returned by the "link" method, or an expanded link, as returned by "expanded".
INTERNAL METHODS
If you directly call any of these methods, puppies will die.
_initparams ( [ $from ] { handler => coderef | option => value } | $metafield )
Initialize parameters as passed to new or parse. Known parameters are from, error, and link (from is not checked here). In addition you can pass pre and mtime as options.
_startparsing
Open a BEACON file and parse all meta fields. Calling this method will reset the whole object but not the parameters as set with _initparams. If no source has been specified (with parameter from), this is all the method does. If a source is given, it is opened and parsed. Parsing stops when the first non-empty and non-meta field line is encountered. This line is internally stored as lookahead.
_handle_error ( $msg [, $line ] )
Internal error handler that calls a custom error handler, increases the error counter and stores the last error.
_readline
Internally read and return a line for parsing afterwards. May trigger an error.
_fields
Gets one or more fields, which are strings that do not contain | or newlines. The first string is not empty. Returns a reference to an array of four fields.
_expandlink ( $link )
Expand a link, provided as array reference without validation. The link must have four defined, trimmed fields. After expansion, source and target must still be checked whether they are valid URIs.
_is_uri
Check whether a given string is a URI. This function is based on code of Data::Validate::URI, adapted for performance.
DEVELOPMENT
Please visit http://github.com/nichtich/p5-data-beacon for the latest development snapshot, bug reports, feature requests, and such.
SEE ALSO
See also SeeAlso::Server for an API to exchange single sets of beacon links, based on the same source identifier.
AUTHOR
Jakob Voss <jakob.voss@gbv.de>
LICENSE
Copyright (C) 2010 by Verbundzentrale Goettingen (VZG) and Jakob Voss
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.
In addition you may fork this library under the terms of the GNU Affero General Public License.