NAME

Data::Beacon - BEACON format validating parser and serializer

SYNOPSIS

use Data::Beacon;

$beacon = Data::Beacon->new( $beaconfile );
$beacon = beacon( $beaconfile ); # equivalent

$beacon = beacon( { FOO => "bar" } ); # empty Beacon with meta fields

$beacon->meta();                                   # get all meta fields
$beacon->meta( 'DESCRIPTION' => 'my best links' ); # set meta fields
$d = $beacon->meta( 'DESCRIPTION' );               # get meta field
$beacon->meta( 'DESCRIPTION' => '' );              # unset meta field
print $beacon->metafields();

$beacon->parse(); # proceed parsing links

$beacon->parse( error => 'print' );          # print errors to STDERR
$beacon->parse( error => \&error_handler );

$beacon->parse( $beaconfile );
$beacon->parse( \$beaconstring );
$beacon->parse( sub { return $nextline } );

$beacon->count();      # number of parsed links
$beacon->errorcount(); # number of parsing errors

DESCRIPTION

This package implements a parser and serializer for BEACON format with dedicated error handling. A Beacon, as implemented by Data::Beacon, is a set of links together with some meta fields that describe it. Each link consists of four values: source (also referred to as id), label, description, and target. Source and target are mandatory URIs; label and description are strings that default to the empty string.

BEACON format is the serialization format for Beacons. It defines a very condensed syntax to express links without having to deal much with technical specifications.

See http://meta.wikimedia.org/wiki/BEACON for a more detailed description.

SERIALIZING

To serialize only BEACON meta fields, create a new Beacon object, and set its meta fields (passed to the constructor, or with "meta"). You can then get the meta fields in BEACON format with "metafields":

my $beacon = beacon( { PREFIX => ..., TARGET => ... } );
print $beacon->metafields;

The easiest way to serialize links in BEACON format is to set your Beacon object's link handler to print, so that each link is printed directly to STDOUT. If you also set the error handler to print, errors are printed to STDERR.

my $beacon = beacon( \%metafields, error => 'print', link => 'print' );
print $beacon->metafields();

while ( ... ) {
    $beacon->appendlink( $source, $label, $description, $target );
}

Alternatively you can use the function "plainbeaconlink". In this case you should validate links before printing:

if ( $beacon->appendlink( $source, $label, $description, $target ) ) {
    print plainbeaconlink( $beacon->link ) . "\n";
}

PARSING

You can parse BEACON format either as an iterator (pull parsing):

my $beacon = beacon( $file );
while ( $beacon->nextlink ) {
    my ($source, $label, $description, $target, $sourceuri, $targeturi) = $beacon->link;
    ...
}

Or by push parsing with handler callbacks:

my $beacon = beacon( $file );
$beacon->parse( 'link' => \&link_handler );
$errors = $beacon->errorcount;

Instead of a filename, you can also provide a scalar reference to parse from a string. In both cases the meta fields are parsed immediately:

my $beacon = beacon( $file );
print $beacon->metafields . "\n";
my $errors = $beacon->errorcount;

To quickly parse a BEACON file:

use Data::Beacon;
beacon($file)->parse();

QUERYING

METHODS

new ( [ $from ] { handler => coderef } | $metafields )

Create a new Beacon object, optionally from a given file. If you specify a source via the $from argument or as the parameter from => $from, it will be opened for parsing and all meta fields will be read from it immediately. Otherwise you get an empty, but initialized, Beacon object. See the parse method for more details about possible handlers as parameters.

meta ( [ $key [ => $value [ ... ] ] ] )

Get and/or set one or more meta fields. Returns a hash (no arguments), or a string or undef (one argument), or croaks on invalid arguments. A meta field can be unset by setting its value to the empty string. The FORMAT field cannot be unset. This method may also croak if you try to set a known field, such as FORMAT, PREFIX, FEED, EXAMPLES, REVISIT, or TIMESTAMP, to an invalid value. Such an error will neither change the error counter of this object nor modify lasterror.

count

If parsing has been started, returns the number of links successfully read so far (or zero). If only the meta fields have been parsed, this returns the value of the COUNT meta field. In contrast to meta('count'), this method always returns a number. Note that all valid links that could be parsed are included, whether or not they were processed by a link handler.

line

Returns the current line number or zero.

lasterror

Returns the last parsing error message (if any). Errors triggered by directly calling meta are not included. In list context returns a list of error message, line number, and current line content.

errorcount

Returns the number of parsing errors or zero.

metafields

Returns all meta fields, serialized and sorted, as a string. Although the order of fields is irrelevant, this implementation always returns the same fields in the same order. To get all meta fields as a hash, use the meta method.

parse ( [ $from ] { handler => coderef | option => $value } )

Parse all remaining links (push parsing). If a from parameter is provided, this starts a new Beacon. That means the following three are equivalent:

$b = Data::Beacon->new( $from );

$b = Data::Beacon->new( from => $from );

$b = Data::Beacon->new;
$b->parse( $from );

If from is a scalar, it is used as the name of the file to parse from. Alternatively you can supply a string reference or a code reference.

The pre option can be used to set some meta fields before parsing starts. These fields are cached and reused every time you call parse.

If the mtime option is given, the TIMESTAMP meta value will be initialized as last modification time of the given file.

By default, all errors are silently ignored unless you specify an error handler. The last error can be retrieved with the lasterror method and the number of errors with errorcount. Returns true only if errorcount is zero after parsing. Note that some errors may be less important.

Finally, the link handler can be a code reference to a method that is called for each link (that is each line in the input that contains a valid link). The following arguments are passed to the handler:

$source

Link source as given in BEACON format. This may be abbreviated, but it is never the empty string.

$label

Label as string. This may be the empty string.

$description

Description as string. This may be the empty string.

$target

Link target as given in BEACON format. This may be abbreviated or the empty string.

The number of successfully parsed links is returned by count.

Errors in the link handler and the input handler are caught and produce an error that is passed to the error handler.
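
The control flow described above can be sketched in plain Perl. This is only an illustration under stated assumptions, not the code of Data::Beacon: the function push_parse_sketch is hypothetical, lines starting with # are assumed to be meta field lines, every link line is assumed to use up to four fields separated by |, and the error handler is assumed to receive a message and a line number.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a push-parsing loop; NOT Data::Beacon's code.
# $next : code reference returning the next input line, undef at end of input
# %opt  : link => coderef, error => coderef (both optional)
sub push_parse_sketch {
    my ( $next, %opt ) = @_;
    my ( $count, $errors, $line_no ) = ( 0, 0, 0 );

    # Count every error and forward it to the error handler, if any.
    my $report = sub {
        my ($msg) = @_;
        $errors++;
        $opt{error}->( $msg, $line_no ) if $opt{error};
    };

    while ( defined( my $line = $next->() ) ) {
        $line_no++;
        chomp $line;
        next if $line =~ /^\s*$/;    # empty lines are skipped
        next if $line =~ /^#/;       # assumption: meta field lines start with '#'

        # Split into trimmed fields, keeping empty trailing fields.
        my @link = map { my $f = $_; $f =~ s/^\s+|\s+$//g; $f }
            split /\|/, $line, -1;

        if ( @link > 4 || $link[0] eq '' ) {
            $report->('invalid link line');
            next;
        }
        push @link, '' while @link < 4;    # label, description, target default to ''
        $count++;                          # counted whether or not a handler exists

        next unless $opt{link};

        # Errors thrown by the link handler are caught and reported.
        unless ( eval { $opt{link}->(@link); 1 } ) {
            my $err = $@;
            chomp $err;
            $report->("link handler died: $err");
        }
    }

    # True only if no errors occurred, followed by both counters.
    return ( $errors == 0, $count, $errors );
}

my @input = (
    "#PREFIX: http://example.org/\n",
    "a|Label A|first|http://example.com/a\n",
    "\n",
    "|broken\n",
    "b\n",
);
my @seen;
my ( $ok, $count, $errors ) = push_parse_sketch(
    sub { shift @input },
    link  => sub { push @seen, $_[0] },
    error => sub { warn "error in line $_[1]: $_[0]\n" },
);
print "links: $count, errors: $errors\n";    # links: 2, errors: 1
```

The sketch returns a false value as its first result here because one line was invalid, which mirrors the rule that parse returns true only if errorcount is zero.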

nextlink

Read from the input stream until the next link has been parsed. Empty lines and invalid lines are skipped, but the error handler is called on invalid lines. This method can be used for pull parsing. Always returns either the link as a list or an empty list if the end of input has been reached.

link

Returns the last valid link that has been read. The link is returned as a list of four values (source, label, description, target) without expansion. Use the "expanded" method to get the link with full URIs.

expanded

Returns the last valid link that has been read, in expanded form. The link is returned as a list of four values (source, label, description, target), possibly expanded by the meta fields PREFIX and TARGET/TARGETPREFIX.

expand ( $source, $label, $description, $target )

Expand a link, consisting of source (mandatory), and label, description, and target (all optional). Returns the expanded link as an array with four values, or an empty list. This method does not append the link to the Beacon object, nor does it call any handlers.
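
To make the idea of expansion concrete, here is a minimal sketch in plain Perl. It is not the module's implementation: the function expand_link_sketch is hypothetical, and the expansion rules are simplifying assumptions (PREFIX is prepended to the source, TARGET is treated as a template in which {ID} is replaced by the source, and TARGETPREFIX is prepended to the target). Consult the BEACON description for the authoritative rules.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Data::Beacon: expand one link with a
# hash reference of meta fields. The rules below are simplified assumptions.
sub expand_link_sketch {
    my ( $meta, $source, $label, $description, $target ) = @_;

    # The source is mandatory; without it there is nothing to expand.
    return unless defined $source && length $source;

    # Optional fields default to the empty string.
    for ( $label, $description, $target ) {
        $_ = '' unless defined $_;
    }

    # Assumption: PREFIX is simply prepended to the source.
    my $prefix = defined $meta->{PREFIX} ? $meta->{PREFIX} : '';
    my $full_source = $prefix . $source;

    my $full_target = $target;
    if ( defined $meta->{TARGET} && length $meta->{TARGET} ) {
        # Assumption: TARGET is a template; every {ID} becomes the source.
        ( $full_target = $meta->{TARGET} ) =~ s/\{ID\}/$source/g;
    }
    elsif ( defined $meta->{TARGETPREFIX} ) {
        # Assumption: TARGETPREFIX is prepended to the given target.
        $full_target = $meta->{TARGETPREFIX} . $target;
    }

    return ( $full_source, $label, $description, $full_target );
}

my @link = expand_link_sketch(
    { PREFIX => 'http://example.org/id/', TARGET => 'http://example.com/{ID}' },
    'abc', 'a label',
);
print join( '|', @link ), "\n";
# prints: http://example.org/id/abc|a label||http://example.com/abc
```

Like the method described above, the sketch only returns the expanded values; it neither stores the link nor calls any handler.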

appendline( $line )

Append a line of BEACON format. This method parses the line and calls the link handler or the error handler. In scalar context, returns whether a link has been read (which can then be accessed with link). In list context, returns the parsed link as a list, or the empty list if the line could not be parsed.

appendlink ( $source, $label, $description, $target )

Append a link. The link is validated and returned as a list of four values. On error, the error handler is called and an empty list is returned. On success, the link handler is called.

FUNCTIONS

The following functions are exported by default.

beacon ( [ $from ] { handler => coderef } )

Shortcut for Data::Beacon->new.

plainbeaconlink ( $source, $label, $description, $target )

Serialize a link, consisting of source (mandatory), label, description, and target (all optional), as a condensed string in BEACON format. This function does not check whether the arguments form a valid link. You can pass a simple link, as returned by the "link" method, or an expanded link, as returned by "expanded".
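
As a rough illustration of what a condensed serialization can look like, the following plain Perl sketch joins the four fields with | and drops empty trailing fields. The function plain_link_sketch is hypothetical, and its condensing rule is an assumption; the rules actually applied by plainbeaconlink may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT Data::Beacon's plainbeaconlink: join the four
# link fields with '|' and drop empty trailing fields (an assumed rule).
sub plain_link_sketch {
    my ( $source, $label, $description, $target ) = @_;

    # Undefined fields are treated as empty strings.
    my @fields = map { defined $_ ? $_ : '' }
        ( $source, $label, $description, $target );

    # Without a source there is no link to serialize.
    return '' if $fields[0] eq '';

    # Remove empty fields from the end, but always keep the source.
    pop @fields while @fields > 1 && $fields[-1] eq '';

    return join '|', @fields;
}

print plain_link_sketch( 'abc', 'Label', '', '' ), "\n";
# prints: abc|Label
print plain_link_sketch( 'abc', '', '', 'http://example.com/abc' ), "\n";
# prints: abc|||http://example.com/abc
```

As with the function described above, the sketch performs no validation, so links should be checked (for example with appendlink) before they are printed.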

INTERNAL METHODS

If you directly call any of these methods, puppies will die.

_initparams ( [ $from ] { handler => coderef | option => value } | $metafield )

Initialize parameters as passed to new or parse. Known parameters are from, error, and link (from is not checked here). In addition you can pass pre and mtime as options.

_startparsing

Open a BEACON file and parse all meta fields. Calling this method resets the whole object, but not the parameters set with _initparams. If no source has been specified (with the parameter from), this is all the method does. If a source is given, it is opened and parsed. Parsing stops when the first line that is neither empty nor a meta field is encountered. This line is stored internally as lookahead.

_handle_error ( $msg [, $line ] )

Internal error handler that calls a custom error handler, increases the error counter and stores the last error.

_readline

Internally read and return a line for parsing afterwards. May trigger an error.

_fields

Gets one or more fields, which are strings that contain neither | nor newlines. The first string is not empty. Returns a reference to an array of four fields.

_expandlink ( $link )

Expand a link, provided as array reference without validation. The link must have four defined, trimmed fields. After expansion, source and target must still be checked whether they are valid URIs.

_is_uri

Check whether a given string is a URI. This function is based on code from Data::Validate::URI, adapted for performance.
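
For readers who want a feel for such a check, here is a deliberately simplified sketch in plain Perl. It is neither the module's _is_uri nor the code of Data::Validate::URI: the function looks_like_uri_sketch is hypothetical and only requires a scheme, a colon, and some further characters, while rejecting whitespace and a few characters that may not appear unescaped in a URI.

```perl
use strict;
use warnings;

# Hypothetical simplified check, NOT the module's _is_uri. It accepts a
# string that starts with a scheme and a colon and contains no whitespace
# or characters that must be escaped in a URI.
sub looks_like_uri_sketch {
    my ($value) = @_;
    return 0 unless defined $value;

    # Reject whitespace and characters never allowed unescaped.
    return 0 if $value =~ /[\s<>"{}|\\^`]/;

    # Scheme (letter, then letters, digits, '+', '.', '-'), colon, rest.
    return $value =~ /^[A-Za-z][A-Za-z0-9+.\-]*:.+/ ? 1 : 0;
}

for my $candidate (
    'http://example.org/abc', 'urn:isbn:0451450523',
    'abc',                    'no scheme/here',
    )
{
    printf "%-24s %s\n", $candidate,
        looks_like_uri_sketch($candidate) ? 'URI' : 'not a URI';
}
```

A real validator checks considerably more (authority, path, and escape syntax), so this sketch will accept some strings that a strict check rejects.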

DEVELOPMENT

Please visit http://github.com/nichtich/p5-data-beacon for the latest development snapshot, bug reports, feature requests, and such.

SEE ALSO

See also SeeAlso::Server for an API to exchange single sets of beacon links, based on the same source identifier.

AUTHOR

Jakob Voss <jakob.voss@gbv.de>

LICENSE

Copyright (C) 2010 by Verbundzentrale Goettingen (VZG) and Jakob Voss

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.

In addition you may fork this library under the terms of the GNU Affero General Public License.