NAME

Data::Beacon - BEACON format validating parser and serializer

SYNOPSIS

use Data::Beacon;

$beacon = Data::Beacon->new( $beaconfile );

$beacon->meta();                                   # get all meta fields
$beacon->meta( 'DESCRIPTION' => 'my best links' ); # set meta fields
$d = $beacon->meta( 'DESCRIPTION' );               # get meta field
$beacon->meta( 'DESCRIPTION' => '' );              # unset meta field
print $beacon->metafields();

$beacon->parse(); # proceed parsing links
$beacon->parse( error => sub { print STDERR $_[0] . "\n" } );

$beacon->parse( $beaconfile );
$beacon->parse( \$beaconstring );
$beacon->parse( sub { return $nextline } );

$beacon->count();      # number of parsed links
$beacon->line();       # current line number
$beacon->errorcount(); # number of parsing errors

DESCRIPTION

This package implements a parser and serializer for BEACON format with dedicated error handling.

PARSING

You can parse BEACON from a file this way, using a link handler callback:

my $beacon = Data::Beacon->new( $filename );
$beacon->parse( 'link' => \&link_handler );
$errors = $beacon->errorcount;

Alternatively you can use the parser as iterator:

my $beacon = Data::Beacon->new( $filename );
while (my $link = $beacon->nextlink()) {
    if (ref($link)) {
        my ($id, $label, $description, $to, $fullid, $fulluri) = @$link;
    } else {
        my $error = $link;
    }
}

Instead of a filename, you can also provide a scalar reference, to parse from a string.

METHODS

new ( [ $from ] { handler => coderef } )

Create a new Beacon object, optionally from a given file. If you specify a source via the $from argument or as the parameter from => $from, it is opened for parsing and all meta fields are read from it immediately. Otherwise you get an empty but initialized Beacon object. See the parse method for more details about the handlers that can be passed as parameters.

meta ( [ $key [ => $value [ ... ] ] ] )

Get and/or set one or more meta fields. Returns a hash (no arguments) or a string or undef (one argument), and croaks on invalid arguments. A meta field can be unset by setting its value to the empty string. The FORMAT field cannot be unset. This method may also croak if a known field, such as FORMAT, PREFIX, FEED, EXAMPLES, REVISIT, or TIMESTAMP, is set to an invalid value. Such an error does not change the error counter of this object or modify lasterror.
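
The calling conventions of meta can be illustrated with a small stand-alone sketch. It is not the module's implementation: the function name meta_sketch, the %fields store, the upper-casing of keys, and the choice to silently ignore an attempt to unset FORMAT are assumptions made for illustration, and validation of known fields is left out.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical store; FORMAT is always present.
my %fields = ( FORMAT => 'BEACON' );

# Sketch of the documented calling conventions (not Data::Beacon's code).
sub meta_sketch {
    my @args = @_;

    # No arguments: return all fields as a hash.
    return %fields unless @args;

    # One argument: return the value of that field, or undef.
    if ( @args == 1 ) {
        croak 'meta field name must be defined' unless defined $args[0];
        return $fields{ uc $args[0] };    # upper-casing is an assumption
    }

    # Otherwise: key => value pairs to set or unset.
    croak 'expected key => value pairs' if @args % 2;
    while ( my ( $key, $value ) = splice @args, 0, 2 ) {
        croak 'meta field name must be defined' unless defined $key;
        $key = uc $key;
        if ( !defined $value or $value eq '' ) {
            next if $key eq 'FORMAT';     # assumed: silently kept
            delete $fields{$key};         # empty string unsets the field
        }
        else {
            $fields{$key} = $value;
        }
    }
    return;
}

meta_sketch( 'DESCRIPTION' => 'my best links' );    # set
my $d = meta_sketch('DESCRIPTION');                 # get
meta_sketch( 'DESCRIPTION' => '' );                 # unset
```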

count

If parsing has been started, returns the number of links successfully read so far (or zero). If only the meta fields have been parsed, this returns the value of the COUNT meta field. In contrast to meta('count'), this method always returns a number. Note that all valid links that could be parsed are counted, whether or not they were processed by a link handler.

line

Returns the current line number.

lasterror

Returns the last parsing error message (if any). Errors triggered by directly calling meta are not included. In list context returns a list of error message, line number, and current line content.

errorcount

Returns the number of parsing errors or zero.

metafields

Return all meta fields, serialized and sorted, as a string. Although the order of fields is irrelevant, this implementation always returns the same fields in the same order. To get all meta fields as a hash, use the meta method.

parse ( [ $from ] { handler => coderef | pre => $hashref } )

Parse all remaining links (push parsing). If a from parameter is provided, this starts a new Beacon. That means the following three are equivalent:

$b = Data::Beacon->new( $from );

$b = Data::Beacon->new( from => $from );

$b = Data::Beacon->new;
$b->parse( $from );

If from is a scalar, it is used as the name of the file to parse from. Alternatively you can supply a string reference or a code reference.
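
The three kinds of source can be told apart with ref. The following sketch uses only core Perl and is not taken from Data::Beacon; the helper name line_reader and the sample input are hypothetical. It returns a closure that yields one line per call and undef at the end of input.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a source into a "next line" closure.
sub line_reader {
    my ($from) = @_;

    # A code reference is already a line source: use it as is.
    return $from if ref $from eq 'CODE';

    # A plain scalar is treated as a file name, a scalar reference as an
    # in-memory string (open accepts a scalar reference in place of a name).
    open( my $fh, '<', $from ) or die "Failed to open source: $!";
    return sub { return scalar <$fh> };
}

# Illustrative input: one meta line and one link line.
my $text = "#FORMAT: BEACON\nfoo|bar\n";
my $next = line_reader( \$text );

my @lines;
while ( defined( my $line = $next->() ) ) {
    chomp $line;
    push @lines, $line;
}
print scalar(@lines), " lines read\n";
```

Returning a closure in every case lets the calling code read lines the same way regardless of where the input comes from.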

The pre argument can be used to set some meta fields before parsing starts. These fields are cached and reused every time you call parse.

By default, all errors are silently ignored unless you specify an error handler. The last error can be retrieved with the lasterror method and the number of errors with errorcount. The method returns true only if errorcount is zero after parsing. Note that some errors may be less important.

Finally, the link handler can be a code reference to a method that is called for each link (that is each line in the input that contains a valid link). The following arguments are passed to the handler:

( $id, $label, $description, $to, $fullid, $fulluri )

Please note that $label, $description, and $to may be the empty string, while $fullid and $fulluri are URIs.

The number of successfully parsed links is returned by count.

Errors in the link handler and the input handler are caught and produce an error that is passed to the error handler.
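
One way such errors can be caught is to wrap the handler call in eval and pass the exception text on. The sketch below is a stand-alone illustration under that assumption, not the module's code; call_link_handler, the message prefix, the sample link values, and the example.org URIs are hypothetical.

```perl
use strict;
use warnings;

# Error handler: collect messages instead of printing them.
my @errors;
my $on_error = sub { my ($msg) = @_; push @errors, $msg };

# Hypothetical wrapper: run the link handler and route any exception
# to the error handler instead of letting it abort parsing.
sub call_link_handler {
    my ( $handler, $error, @link ) = @_;
    my $ok = eval { $handler->(@link); 1 };
    return 1 if $ok;
    my $msg = "link handler died: $@";
    $msg =~ s/\s+\z//;                  # drop trailing newline
    $error->($msg) if $error;
    return 0;
}

# A link handler receives the six documented arguments.
my $on_link = sub {
    my ( $id, $label, $description, $to, $fullid, $fulluri ) = @_;
    die "unexpected id $id\n" if $id eq 'xyz';
    return;
};

call_link_handler( $on_link, $on_error,
    'abc', '', '', 'http://example.org/abc',
    'http://example.org/id/abc', 'http://example.org/abc' );
call_link_handler( $on_link, $on_error,
    'xyz', '', '', 'http://example.org/xyz',
    'http://example.org/id/xyz', 'http://example.org/xyz' );

print scalar(@errors), " error(s) caught\n";
```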

nextlink

Read the input stream until the next link and return it (pull parsing). Returns an array reference for a valid link, or undef after the end of parsing. This method skips over empty lines and errors, but calls the error and link handlers, if enabled.

FUNCTIONS

The following functions are exported by default.

beacon ( [ $from ] { handler => coderef } )

Shortcut for Data::Beacon->new. To quickly parse a BEACON file, use:

use Data::Beacon;
beacon($file)->parse();

Parses a line, interpreted as a link in BEACON format. Unless a target parameter is given, the last part of the line is used as the link destination if it looks like a URI.

Returns an array reference with four values on success, an empty array reference for empty lines, an error string on failure, or undef if the supplied line was not defined. This method does not check whether the query identifier is a valid URI, because it may be expanded by a prefix.

Serialize a link and return it as a condensed string. You must provide four parameters as strings, each of which can be the empty string. '|' characters are silently removed. If $to is not empty but not a URI, or on other errors, the empty string is returned. The $id parameter is not checked for being a URI because it may be abbreviated (without PREFIX).

INTERNAL METHODS

If you directly call any of these methods, puppies will die.

_initparams ( [ $from ] { handler => coderef | option => value } )

Initialize parameters as passed to new or parse. Known parameters are from, error, and link (from is not checked here). In addition you can pass premeta.

_startparsing

Open a BEACON file and parse all meta fields. Calling this method will reset the whole object but not the parameters as set with _initparams. If no source had been specified (with parameter from), this is all the method does. If a source is given, it is opened and parsed. Parsing stops when the first non-empty and non-meta field line is encountered. This line is internally stored as lookahead.

_handle_error ( $msg, $lineno, $line )

Internal error handler that calls a custom error handler, increases the error counter and stores the last error.

_readline

Internally read and return a line for parsing afterwards. May trigger an error.

_parseline ( $line )

Internally parse a line and call appropriate handlers etc. Returns a link as array reference, or an error message as string.

DEVELOPMENT

For the latest development snapshot, bug reports, feature requests, and such, visit http://github.com/nichtich/p5-data-beacon

AUTHOR

Jakob Voss <jakob.voss@gbv.de>

LICENSE

Copyright (C) 2010 by Verbundzentrale Goettingen (VZG) and Jakob Voss

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.