NAME
Treex::PML::Schema - Perl implementation of a PML schema.
DESCRIPTION
This class implements PML schemas. A PML schema consists of a set of type declarations of several kinds, represented by objects inheriting from a common base class, Treex::PML::Schema::Decl.
INHERITANCE
This class inherits from Treex::PML::Schema::Template.
Attribute Paths
Some methods use so-called 'attribute paths' to navigate through nested and referenced type declarations. An attribute path is a '/'-separated sequence of steps, where each step can be one of the following:
!type-name
-
'!' followed by the name of a named type (this step can only occur as the very first step)
name
-
the name of a member of a structure, an element of a sequence, or an attribute of a container, specifying the type declaration of the named component
#content
-
the string '#content', specifying the content type declaration of a container
LM
-
specifying the type declaration of a list
AM
-
specifying the type declaration of an alt
[NNN]
-
where NNN is a decimal number (ignored); equivalent to LM or AM
Steps of the form LM, AM, and [NNN] (except when occurring at the end of an attribute path) may be omitted.
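The step kinds listed above can be illustrated with a small stand-alone sketch. It is not the parser used by Treex::PML::Schema; the function name classify_attribute_path and the kind labels are hypothetical, and the sketch only classifies steps syntactically without consulting any schema.

```perl
use strict;
use warnings;

# Illustration only: split an attribute path into steps and label each
# step with one of the kinds described above. The function name and the
# labels are hypothetical, not part of Treex::PML::Schema.
sub classify_attribute_path {
    my ($path) = @_;
    # A leading '/' produces an empty first field; drop empty fields.
    my @steps = grep { length } split m{/}, $path;
    my @result;
    for my $i (0 .. $#steps) {
        my $step = $steps[$i];
        my $kind;
        if ($i == 0 && $step =~ /^!.+/) {
            $kind = 'named-type';           # '!' + type name, first step only
        }
        elsif ($step eq '#content') {
            $kind = 'container-content';
        }
        elsif ($step eq 'LM') {
            $kind = 'list-member';
        }
        elsif ($step eq 'AM') {
            $kind = 'alt-member';
        }
        elsif ($step =~ /^\[\d+\]$/) {
            $kind = 'list-or-alt-member';   # the number itself is ignored
        }
        else {
            $kind = 'name';                 # member, element, or attribute
        }
        push @result, [ $step, $kind ];
    }
    return @result;
}

for my $pair (classify_attribute_path('!a-node.type/children/LM/[2]/#content')) {
    printf "%-14s %s\n", @$pair;
}
```

Because LM, AM, and [NNN] steps may be omitted in the middle of a path, a real resolver has to consult the type declarations; a purely syntactic split like this one cannot tell whether a list or alt was skipped.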
EXPORT
This module exports constants for declaration types.
EXPORT TAGS
CONSTANTS
See Treex::PML::Schema::Constants.
METHODS
- Treex::PML::Schema->new ({ option => value, ... })
-
NOTE: Don't call this constructor directly, use Treex::PML::Factory->createPMLSchema() instead!
Parses an XML representation of a PML schema from a string, filehandle, local file, or URL, processing the modular instructions as described in http://ufal.mff.cuni.cz/jazz/PML/doc/pml_doc.html#processing, and returns the corresponding Treex::PML::Schema object.
One of the following options must be given:
string
-
an XML string to parse
filename
-
a file name or URL
fh
-
a file-handle (IO::File, IO::Pipe, etc.) open for reading
The following options are optional:
base_url
-
base URL for referred schemas (useful when parsing from a file-handle or a string)
use_resources
-
if this option is used with a true value, the parser will attempt to locate referred schemas also in Treex::PML resource paths.
revision, minimal_revision, maximal_revision
-
constraints on the revision number of the schema.
validate
-
if this option is used with a true value, the parser will validate the schema on the fly using a RelaxNG grammar given by the relaxng_schema parameter; if relaxng_schema is not given, the file 'pml_schema_inline.rng', searched for in Treex::PML resource paths, is assumed.
relaxng_schema
-
a particular RelaxNG grammar to validate against. The value may be a URL or filename for the grammar in the RelaxNG XML format, or an XML::LibXML::RelaxNG object. The compact format is not supported.
- Treex::PML::Schema->readFrom (filename,opts)
-
An obsolete alias for Treex::PML::Schema->new({%$opts, filename=>$filename}).
- $schema->write ({option => value})
-
This method serializes the Treex::PML::Schema object to XML. See Treex::PML::Schema::XMLNode->write for implementation.
IMPORTANT: The resulting schema is simplified; that is, all modular instructions are processed and removed from it (see http://ufal.mff.cuni.cz/jazz/PML/doc/pml_doc.html#processing).
One of the following options must be given:
string
-
a scalar reference to which the XML is to be stored as a string
filename
-
a file name
fh
-
a file-handle (IO::File, IO::Pipe, etc.) open for writing
The following options are optional:
no_backups
-
if this option is used with a true value, the writer will not attempt to create backup (tilde) files when overwriting an existing file.
no_indent
-
if this option is used with a true value, the writer will not add additional newlines and indentation white-space to the resulting XML.
- $schema->get_url ()
-
Return location of the PML schema file.
- $schema->set_url ($URI)
-
Set location of the PML schema file.
- $schema->get_pml_version ()
-
Return PML version the schema conforms to.
- $schema->get_revision ()
-
Return PML schema revision.
- $schema->get_description ()
-
Return PML schema description.
- $schema->get_root_decl ()
-
Return the root type declaration (see Treex::PML::Schema::Root).
- $schema->get_root_type ()
-
Like $schema->get_root_decl->get_content_decl.
- $decl->get_decl_type ()
-
Return the constant PML_SCHEMA_DECL (for compatibility with the Treex::PML::Schema::Decl interface).
- $decl->get_decl_type_str ()
-
Return the string 'schema' (for compatibility with the Treex::PML::Schema::Decl interface).
- $schema->get_root_name ()
-
Return name of the root element for PML instance.
- $schema->get_type_names ()
-
Return names of all named type declarations.
- $schema->get_named_references ()
-
This method returns a list of HASHrefs containing information about named references to PML instances (each hash will currently have the keys 'name' and 'readas').
- $schema->get_named_reference_info (name)
-
This method retrieves information about a specific named instance reference as a hash (currently with keys 'name' and 'readas').
- Treex::PML::Schema::cmp_revisions($A, $B)
-
This function compares two schema revision strings according to the rules described in the PML specification. Returns -1 if revision $A precedes revision $B, 0 if the revisions are equal (equivalent), and 1 if revision $A follows revision $B.
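The three-way return convention can be sketched in plain Perl. The sketch below is not the code of Treex::PML::Schema::cmp_revisions; it assumes that a revision is a dot-separated sequence of non-negative integers compared numerically from left to right, and that a missing component counts as 0. The name cmp_revisions_sketch is hypothetical.

```perl
use strict;
use warnings;

# Illustration only, under stated assumptions:
#  - a revision is a dot-separated list of non-negative integers;
#  - components are compared numerically, left to right;
#  - a missing component counts as 0, so '1.2' and '1.2.0' are equivalent.
sub cmp_revisions_sketch {
    my ($rev_a, $rev_b) = @_;
    my @parts_a = split /\./, $rev_a;
    my @parts_b = split /\./, $rev_b;
    while (@parts_a or @parts_b) {
        my $x = @parts_a ? shift @parts_a : 0;
        my $y = @parts_b ? shift @parts_b : 0;
        my $cmp = $x <=> $y;
        return $cmp if $cmp;    # first differing component decides
    }
    return 0;                   # all components equal
}

for my $pair (['1.2', '1.10'], ['1.2', '1.2.0'], ['2', '1.9.9']) {
    printf "%-6s vs %-6s => %d\n", @$pair, cmp_revisions_sketch(@$pair);
}
```

Numeric comparison of components is what makes '1.10' follow '1.2'; a plain string comparison would order them the other way round.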
- $schema->for_each_decl (sub{...})
-
This method traverses all nested declarations and sub-declarations and calls a given subroutine passing the sub-declaration object as a parameter.
- $schema->check_revision({ option=>value })
-
Check that the schema revision satisfies the given constraints. The following options are supported:
revision
: exact revision number to match
minimal_revision
: minimal revision number to match
maximal_revision
: maximal revision number to match
revision error
: an optional error message format string with %f standing for the schema filename or URL and %e for the error string. Defaults to 'Error: wrong schema revision of %f: %e'.
- $schema->convert_from_hash
-
Compatibility method building the schema object from a nested hash structure created by XML::Simple which was used in older implementations. This is useful for upgrading objects stored in old binary dumps.
- $schema->find_type_by_path (attribute-path,noresolve,decl)
-
Locate a declaration specified by attribute-path starting from declaration decl. If decl is undefined, the root type declaration is used. (Note that attribute paths starting with '/' are always evaluated starting from the root declaration and paths starting with '!' followed by the name of a named type are evaluated starting from that type.) All references to named types are transparently resolved in each step.
The caller should pass a true value in noresolve to force Member, Attribute, Element, Type, or Root declaration objects to be returned rather than declarations of their content.
An attribute path is a '/'-separated sequence of steps (member, attribute, or element names, or strings matching [\d*]) which identifies a certain nested type declaration. A step of the aforementioned form [\d*] matches the content declaration of a List or Alt. Note, however, that named steps dive into List or Alt declarations automatically, too.
- $schema->find_types_by_role (role,start_decls)
-
Return a list of declarations (objects derived from Treex::PML::Schema::Decl) that have role equal to role.
If start_decls is specified, it must be an ARRAY reference of declarations; in that case, only declarations nested below the listed ones are considered.
- $schema->find_role (role,start_decl,opts)
-
WARNING: this function can be very slow, esp. if the type declarations are recursive.
Return a list of attribute paths leading to nested type declarations of decl with role equal to role.
This is equivalent to
$schema->find_decl(sub{ $_[0]->{role} eq $role },$decl,$opts);
Please see the documentation for find_decl for more information.
- $schema->find_decl (callback,start_decl,opts)
-
WARNING: this function can be very slow, esp. if the type declarations are recursive.
Return a list of attribute paths leading to nested type declarations of decl for which a given callback returns a true value. The tested type declaration is passed to the callback as the first (and only) argument.
If start_decls is specified, it must be an ARRAY reference of declarations; in that case, only declarations nested in or referred to from the listed ones are considered.
In array context, all matching nested declarations are returned. In scalar context, only the first one is returned (with early stopping).
The last argument opts can be used to pass some flags to the algorithm. Currently only the flag no_childnodes is available. If true, the function never recurses into the content declarations of declarations with the role #CHILDNODES.
- $schema->node_types ()
-
Return a list of all type declarations with the role #NODE.
- $schema->get_type_by_name (name)
-
Return the declaration of the named type with a given name (see Treex::PML::Schema::Type).
- $schema->validate_object (object, type_decl, log, flags)
-
Validates the data content of the given object against a specified type declaration. The type_decl argument must either be an object derived from the Treex::PML::Schema::Decl class or the name of a named type.
An array reference may be passed as the optional 3rd argument log to obtain a detailed report of all validation errors.
The flags argument can specify flags that influence the validation. The following constants can be binary-OR'ed to obtain the flags:
PML_VALIDATE_NO_TREES - do not validate nested data with roles #CHILDNODES or #TREES and do not require that objects with the role #NODE implement the Treex::PML::Node role.
PML_VALIDATE_NO_CHILDNODES - do not validate nested data with the role #CHILDNODES.
Returns: 1 if the content conforms, 0 otherwise.
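How binary-OR'ed flags combine can be shown with a small stand-alone sketch. The constants below are hypothetical stand-ins with made-up values (DEMO_VALIDATE_NO_TREES, DEMO_VALIDATE_NO_CHILDNODES); the real PML_VALIDATE_* constants and their values come from the library and are not reproduced here. The helper describe_flags is likewise hypothetical.

```perl
use strict;
use warnings;

# HYPOTHETICAL stand-ins for the PML_VALIDATE_* constants. The values are
# made up for this illustration; each flag must occupy a distinct bit.
use constant {
    DEMO_VALIDATE_NO_TREES      => 1,
    DEMO_VALIDATE_NO_CHILDNODES => 2,
};

# Report which of the demo flags are set in a combined flags value.
sub describe_flags {
    my ($flags) = @_;
    my @set;
    push @set, 'NO_TREES'      if $flags & DEMO_VALIDATE_NO_TREES;
    push @set, 'NO_CHILDNODES' if $flags & DEMO_VALIDATE_NO_CHILDNODES;
    return @set;
}

# Combine flags with binary OR, as the flags argument expects.
my $flags = DEMO_VALIDATE_NO_TREES | DEMO_VALIDATE_NO_CHILDNODES;
print join(', ', describe_flags($flags)), "\n";

# With the real constants, the combined value would be passed as the
# 4th argument:
#   $schema->validate_object($object, $type_decl, \@log, $flags);
```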
- $schema->validate_field (object, attr-path, type, log)
-
This method is similar to validate_object, but in this case the validation is restricted to the data substructure of object specified by the attr-path argument. type is the type of object, specified either by the name of a named type, or as a Treex::PML::Type, or a type declaration.
An array reference may be passed as the optional 4th argument log to obtain a detailed report of all validation errors.
Returns: 1 if the content conforms, 0 otherwise.
- $schema->get_paths_to_atoms (\@decls, \%opts)
-
This method returns a list of all non-periodic canonical paths leading from given types to atomic values. Currently only the following options are supported:
no_childnodes => $bool
If true, the method does not descend to member types with the role #CHILDNODES.
no_nodes => $bool
If true, the method does not descend to member types with the role #NODE (except for the starting types).
with_LM => $bool
If true, the paths will include an LM step for each List type on the path.
with_AM => $bool
If true, the paths will include an AM step for each Alt type on the path.
with_Seq_brackets => $bool
If true, a [0] is appended after each step representing a sequence element.
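The effect of options such as with_LM and no_childnodes can be illustrated on a toy type model. The sketch below does not use the Treex::PML::Schema::Decl classes: types are plain hashes with a hypothetical layout (kind, members, of, role), the function names are hypothetical, and recursive (periodic) types are not handled.

```perl
use strict;
use warnings;

# Toy type model (NOT the real declaration classes):
#   { kind => 'atom' }
#   { kind => 'list',   of      => $type, role => ... }
#   { kind => 'struct', members => { name => $type, ... } }

# Join a path prefix and a step with '/', without a leading slash.
sub join_step {
    my ($prefix, $step) = @_;
    return length($prefix) ? "$prefix/$step" : $step;
}

# Collect paths from $type down to atomic values.
sub paths_to_atoms {
    my ($type, $opts, $prefix) = @_;
    $opts ||= {};
    $prefix = '' unless defined $prefix;
    my $kind = $type->{kind};
    return ($prefix) if $kind eq 'atom';
    if ($kind eq 'list') {
        # Emit an explicit LM step only when with_LM is requested.
        my $next = $opts->{with_LM} ? join_step($prefix, 'LM') : $prefix;
        return paths_to_atoms($type->{of}, $opts, $next);
    }
    my @paths;
    for my $name (sort keys %{ $type->{members} }) {
        my $member = $type->{members}{$name};
        my $role = defined $member->{role} ? $member->{role} : '';
        # Skip members with the role #CHILDNODES when asked to.
        next if $opts->{no_childnodes} && $role eq '#CHILDNODES';
        push @paths, paths_to_atoms($member, $opts, join_step($prefix, $name));
    }
    return @paths;
}

my $node_type = {
    kind    => 'struct',
    members => {
        form     => { kind => 'atom' },
        tags     => { kind => 'list', of => { kind => 'atom' } },
        children => { kind => 'list', role => '#CHILDNODES',
                      of   => { kind => 'atom' } },
    },
};
print "$_\n" for paths_to_atoms($node_type, { with_LM => 1, no_childnodes => 1 });
```

Sorting the member names only makes the output order deterministic in this sketch; it is not a claim about the order in which the real method returns paths.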
- $schema->attributes (decl...)
-
This function tries to emulate the behavior of Treex::PML::FSFormat->attributes to some extent.
Return attribute paths to all atomic subtypes of given type declarations. If no type declaration objects are given, then types with role #NODE are assumed. This function never descends to subtypes with role #CHILDNODES.
- $schema->post_process($options)
-
Auxiliary method used internally by the PML Schema parser. It simplifies the schema and for each declaration object creates back references to its parent declaration and schema and pre-computes the type attribute path returned by $decl->get_decl_path().
CLASSES FOR TYPE DECLARATIONS
- Treex::PML::Schema::Decl
- Treex::PML::Schema::Root
- Treex::PML::Schema::Type
- Treex::PML::Schema::Struct
- Treex::PML::Schema::Container
- Treex::PML::Schema::Seq
- Treex::PML::Schema::List
- Treex::PML::Schema::Alt
- Treex::PML::Schema::Choice
- Treex::PML::Schema::CDATA
- Treex::PML::Schema::Constant
- Treex::PML::Schema::Member
- Treex::PML::Schema::Element
- Treex::PML::Schema::Attribute
SEE ALSO
Prague Markup Language (PML) format: http://ufal.mff.cuni.cz/jazz/PML/
Tree editor TrEd: http://ufal.mff.cuni.cz/~pajas/tred
Related packages: Treex::PML, Treex::PML::Schema::Template, Treex::PML::Schema::Decl, Treex::PML::Instance
COPYRIGHT AND LICENSE
Copyright (C) 2006-2010 by Petr Pajas
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.2 or, at your option, any later version of Perl 5 you may have available.