NAME

File::OM - output multiplexer routines

SYNOPSIS

use File::OM;              # to import routines into a Perl script

$om = File::OM->new(       # make output object that creates strings in
      $format, {           # XML, Turtle, JSON, ANVL, or Plain formats
  outhandle => *STDOUT,    # (opt) print string instead of returning it
  verbose => 1 });         # (opt) also output record and line numbers

$om->ostream();            # open stream

$om->cstream();            # close stream

$om->orec(                 # open record
      $recnum,             # record number (default 1)
      $lineno);            # input line number (default 1)

$om->crec(                 # close record
      $recnum);            # record number (default 1)

$om->elem(                 # output an entire element
      $name,               # string representing element name
      $value,              # string representing element value
      $elemnum,            # element number (default 1)
      $lineno);            # starting input line number (default '1:')

$om->elems(                # output elements; wrap ANVL/Plain/XML lines
      $name,               # string representing first element name
      $value,              # string representing first element value
      ...);                # other element names and values

$om->name_encode($s);      # encode a name
$om->value_encode($s);     # encode a value
$om->comment_encode($s);   # encode a comment or pseudo-comment

om_opt_defaults();         # get hash reference with factory defaults

DESCRIPTION

The OM (Output Multiplexer) Perl module provides a general output formatting framework for data that can be represented as records consisting of elements, values, and comments. Specific conversions are possible to XML, Turtle, JSON, ANVL, and "Plain" unlabeled text.

The internal element structure is currently identical to the structure returned by File::ANVL::anvl_recarray. The first triple of the returned array is special in that it describes the origin of the record; its elements are

INDEX   NAME        VALUE
  0     format      original format ("ANVL", "JSON", "XML", etc)
  1     <unused>
  2     <unused>

The remaining triples are free form except that the values will have been drawn from the original format and possibly decoded. The first item ("lineno") in each remaining triple is a number followed by a marker character, such as "34:" or "6#". The number indicates the line number (or octet offset, depending on the origin format) of the start of the element. The marker is either ':' to indicate a real element or '#' to indicate a comment; if the latter, the element name has no defined meaning and the comment is contained in the value.
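The "lineno" convention can be illustrated with a small sketch. The helper name parse_lineno() is hypothetical and is not part of File::OM or File::ANVL; it only assumes the "number followed by ':' or '#'" shape described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::OM): split a "lineno" token such
# as "34:" or "6#" into its number and its element/comment marker.
sub parse_lineno {
    my ($token) = @_;
    return unless defined $token && $token =~ /^(\d+)([:#])$/;
    return { number => $1, is_comment => ($2 eq '#') ? 1 : 0 };
}

for my $token ('34:', '6#') {
    my $info = parse_lineno($token);
    printf "%s -> line %d, %s\n", $token, $info->{number},
        $info->{is_comment} ? 'comment' : 'element';
}
```

A caller walking the triples could use such a check to decide whether to treat the value as element content or as a comment.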

OM presents an object-oriented interface. The object constructor takes a format argument and returns undef if the format is unknown. The returned object has methods for creating format-appropriate output corresponding (currently) to five output modes; for a complete application of these methods, see File::ANVL::anvl_om.
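As a rough sketch of how the methods fit together, the following wraps the stream, record, and element calls from the SYNOPSIS in one routine. The routine name render_records() and the sample data are made up for illustration; the code assumes that, with no 'outhandle' set, each method returns a string, and it skips quietly if File::OM is not installed.

```perl
use strict;
use warnings;

# Render records (each an array ref of [name, value] pairs) in the given
# format. Returns the output string, or undef if File::OM is unavailable
# or the format is unknown. Illustrative only; not part of File::OM.
sub render_records {
    my ($format, @records) = @_;
    eval { require File::OM; 1 } or return;          # module not installed
    my $om = File::OM->new($format, {}) or return;   # undef: unknown format

    my @parts = ($om->ostream());                    # open stream
    my $recnum = 0;
    for my $rec (@records) {
        $recnum++;
        push @parts, $om->orec($recnum);             # open record
        my $elemnum = 0;
        for my $pair (@$rec) {
            my ($name, $value) = @$pair;
            push @parts, $om->elem($name, $value, ++$elemnum);
        }
        push @parts, $om->crec($recnum);             # close record
    }
    push @parts, $om->cstream();                     # close stream
    return join '', grep { defined } @parts;
}

my $xml = render_records('XML',
    [ [ who => 'Kunze, John' ], [ what => 'OM example' ] ]);
print defined $xml ? $xml : "File::OM not available; nothing rendered\n";
```

Passing explicit record and element numbers, rather than relying on the defaults, keeps the same routine usable for every format.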

Constructor options include 'verbose', which causes the methods to insert record and line numbers as comments or pseudo-comments (e.g., for JSON, an extra element called "#" since JSON doesn't support comments). Normally output is returned as a string, but if the 'outhandle' option (defaults to '') contains a file handle, for example,

{ outhandle => *STDOUT }

the string will be printed to the file handle and the method will return the status of the print call. Constructor options and defaults:

{
outhandle        => '',        # return string instead of printing it
indent_start     => '',        # overall starting indent
indent_step      => '  ',      # how much to increment/decrement indent

# Format specific options.
turtle_indent    => '    ',    # turtle has one indent width
turtle_predns    =>            # turtle predicate namespaces
       'http://purl.org/kernel/elements/1.1/',
turtle_nosubject => 'default', # a default subject (change this)
turtle_subjelpat => '',        # pattern for matching subject element
turtle_stream_prefix => 'erc', # symbol we use for turtle
wrap             => 72,        # wrap text to 72 cols (ANVL, Plain, XML)
xml_stream_name  => 'recs',    # for XML output, stream tag
xml_record_name  => 'rec',     # for XML output, record tag

# Used to maintain object state.
indent           => '',        # current indent
elemsref         => [],        # one array to store record elements
}
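A caller would normally override only a few of these defaults. The sketch below passes some of the option names listed above to the constructor; the wrapper make_xml_om() and the tag names 'catalog' and 'entry' are invented for the example, and the code does nothing if File::OM is not installed.

```perl
use strict;
use warnings;

# Build an XML output object that prints to the given handle and uses
# custom stream/record tag names. Returns undef if File::OM is missing.
# The wrapper itself is hypothetical; the option names come from the
# defaults listed in this document.
sub make_xml_om {
    my ($fh) = @_;
    eval { require File::OM; 1 } or return;
    return File::OM->new('XML', {
        outhandle       => $fh,        # print instead of returning strings
        xml_stream_name => 'catalog',  # replaces the default 'recs'
        xml_record_name => 'entry',    # replaces the default 'rec'
        wrap            => 60,         # wrap text at 60 columns
    });
}

my $om = make_xml_om(*STDOUT);
if ($om) {
    $om->ostream();
    $om->orec(1);
    $om->elem('title', 'A short example', 1);
    $om->crec(1);
    $om->cstream();
}
else {
    print "File::OM not available\n";
}
```

Because 'outhandle' is set here, each method call prints its output and returns the status of the print call rather than the string.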

In this release of the OM package, objects carry very limited state information. A sense of current indentation level is maintained, but there is no stack of "open elements". Right now there is only a "whole element at once" method (elem()) that takes name and value arguments to construct a complete element. Future releases may support methods for opening and closing elements.

Most method arguments are optional, but for formats that put separators before every element or record except for the first one (e.g., JSON uses commas), the default values will not produce satisfactory results, because every element or record would then be numbered as if it were the first. The $lineno arguments refer to input line numbers that may be useful with the 'verbose' option and for creating helpful diagnostic messages.
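The separator problem can be shown with a toy formatter. This is not File::OM's code: json_member() is a hypothetical stand-in that assumes a comma is emitted before every member except the one numbered 1, and it does no JSON escaping.

```perl
use strict;
use warnings;

# Hypothetical formatter (not File::OM code): emit one JSON member,
# preceded by a comma unless it is numbered as the first element.
sub json_member {
    my ($name, $value, $elemnum) = @_;
    $elemnum = 1 unless defined $elemnum;        # default element number
    my $sep = $elemnum > 1 ? ",\n" : '';
    return sprintf '%s  "%s": "%s"', $sep, $name, $value;
}

my @pairs = ([ who => 'Kunze' ], [ what => 'OM' ]);

# Relying on the default: every member is treated as the first, so no
# comma is ever emitted and the result is not valid JSON.
my $broken = join '', map { json_member(@$_) } @pairs;

# Passing an incrementing element number puts the separators in place.
my $n = 0;
my $good = join '', map { json_member(@$_, ++$n) } @pairs;

print "{\n$broken\n}\n\n{\n$good\n}\n";
```

The same reasoning applies to record numbers: a caller producing more than one record should count them and pass the count to orec() and crec().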

The caller may elect to use none or all of the methods. It would not be unusual for an application to use only the elem() method for output, especially when another application will be wrapping that output for its own purposes.

SEE ALSO

A Name Value Language (ANVL) http://www.cdlib.org/inside/diglib/ark/anvlspec.pdf

HISTORY

This is a beta version of the OM package. It is written in Perl.

AUTHOR

John A. Kunze jak at ucop dot edu

COPYRIGHT AND LICENSE

Copyright 2009-2010 UC Regents. Open source BSD license.

PREREQUISITES

Perl Modules: Text::Wrap

Script Categories:

UNIX : System_administration