=head1 NAME
XML::Compile - Compilation based XML processing
=head1 SYNOPSIS
# See XML::Compile::Schema
=head1 DESCRIPTION
Many professional applications which process data-centric XML do so
based on a formal specification, expressed as XML Schema. XML::Compile
reads and writes XML data with the help of such schemas. On the Perl
side, the module uses a tree of nested hashes with the same structure.
Where other Perl modules, like SOAP::WSDL, help you use these schemas
(often with a lot of run-time [XPath] searches), this module takes a
different approach: instead of processing the specification at run-time,
it first compiles the expected structure into real Perl, and then uses
that to process the data.
There are many Perl modules with the same intention as this one: translate
between XML and nested hashes. However, there are a few serious
differences: because the schema is used here (and not in the other
modules), we can validate the data. XML requires validation. Next to
this, data-types are formatted and processed correctly. For instance,
the specification prescribes that the C<integer> data-type must accept
huge values of at least 18 digits. More complex data-types like
C<list>, C<union>, and C<substitutionGroup> (unions on complex type level)
are supported as well, which is rarely the case in other modules.
Two general WARNINGS:
=over 4
=item .
The compiler is implemented in L<XML::Compile::Schema::Translate|XML::Compile::Schema::Translate>,
which is B<not finished>. See that manual page for the specific behavior
and its (current) limitations! Please help to find missing pieces and
mistakes.
=item .
The provided B<schema is not validated>! In some cases,
compile-time and run-time errors will be reported, but typically only
in cases where the parser has no idea what to do with such a mistake.
On the other hand, the processed B<data is validated>: the output will
follow the specs closely.
=back
=head1 METHODS
=head2 Constructors
These constructors are base class methods to be extended,
and therefore should not be accessed directly.
$obj-E<gt>B<new>(TOP, OPTIONS)
=over 4
The TOP is the source of XML. See L<dataToXML()|XML::Compile/"Read XML"> for valid options.
Once you have compiled/collected all the readers and writers you need,
you may simply terminate the compiler object: that will clean up
(most of) the XML::LibXML objects.
 Option       Defined in   Default
 schema_dirs               undef
. schema_dirs DIRECTORY|ARRAY-OF-DIRECTORIES
=over 4
Where to find schemas. This can be specified with the
environment variable C<SCHEMA_DIRECTORIES> or with this option.
See L<addSchemaDirs()|XML::Compile/"Accessors"> for a detailed explanation.
=back
=back
=head2 Accessors
$obj-E<gt>B<addSchemaDirs>(DIRECTORIES)
=over 4
Each time this method is called, the specified DIRECTORIES will be added
in front of the list of already known schema directories. Initially,
the value of the environment variable C<SCHEMA_DIRECTORIES> is added
(therefore used last), then the constructor option C<schema_dirs>
is processed.
Values which are C<undef> are skipped. ARRAYs are flattened.
Arguments are split on colons (only when on UNIX) after flattening.
=back
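The rules above can be sketched in plain Perl. This is a minimal
illustration, not the module's own code: the helper name
C<add_schema_dirs> and the example paths are hypothetical, and "UNIX"
is assumed here to mean any platform other than Windows.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::Compile: returns the new search
# list after adding @dirs in front of the already known directories.
sub add_schema_dirs {
    my ($known, @dirs) = @_;

    # Flatten ARRAY references, then skip undefined values.
    my @flat = grep { defined } map { ref($_) eq 'ARRAY' ? @$_ : $_ } @dirs;

    # Split on colons after flattening, but not on Windows (assumption),
    # where a path such as "C:\schemas" contains a colon itself.
    @flat = map { split /:/, $_ } @flat unless $^O eq 'MSWin32';

    return [ @flat, @$known ];
}

# The environment value is added first, so it ends up last in the list.
my $dirs = add_schema_dirs([], '/usr/share/schemas:/opt/schemas');

# A later call (for instance for the constructor option) goes in front.
$dirs = add_schema_dirs($dirs, [ '/home/me/xsd', undef ], '/tmp/xsd');

# On UNIX the search order is now:
#   /home/me/xsd, /tmp/xsd, /usr/share/schemas, /opt/schemas
print join("\n", @$dirs), "\n";
```

Because new directories are placed in front, the most recently added
directory is searched first.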
$obj-E<gt>B<findSchemaFile>(FILENAME)
=over 4
Runs through all defined schema directories (see L<addSchemaDirs()|XML::Compile/"Accessors">)
in search of the specified FILENAME. When the FILENAME is absolute,
it is used as is, and no search takes place. An C<undef> is returned
when the file is not found or not readable, otherwise a full path to
the file is returned to the caller.
=back
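A lookup of this kind could look roughly as follows. This is a sketch
under assumptions, not the module's implementation: the function name
C<find_schema_file>, the explicit directory list argument, and the file
names are all hypothetical.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical stand-in for the lookup described above.
sub find_schema_file {
    my ($dirs, $filename) = @_;

    # An absolute FILENAME is used as is; no search takes place.
    if (File::Spec->file_name_is_absolute($filename)) {
        return (-f $filename && -r _) ? $filename : undef;
    }

    # Otherwise the first readable match in search order wins.
    foreach my $dir (@$dirs) {
        my $path = File::Spec->catfile($dir, $filename);
        return $path if -f $path && -r _;
    }

    return undef;    # not found, or not readable
}

# Create a throw-away schema file to search for.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'demo.xsd');
open my $fh, '>', $file or die "cannot write $file: $!";
print {$fh} "<schema/>\n";
close $fh;

my $found   = find_schema_file([ '/nonexistent', $dir ], 'demo.xsd');
my $missing = find_schema_file([ $dir ], 'absent.xsd');

print defined($found)   ? "found $found\n" : "demo.xsd not found\n";
print defined($missing) ? "found $missing\n" : "absent.xsd not found\n";
```

Callers should test the result with C<defined>, since a missing file is
reported by C<undef> rather than by an exception.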
$obj-E<gt>B<knownNamespace>(NAMESPACE)
XML::Compile-E<gt>B<knownNamespace>(NAMESPACE)
=over 4
Returns the file which contains the definition of a NAMESPACE, if it
is one of the set which is distributed with the L<XML::Compile|XML::Compile>
module.
=back
$obj-E<gt>B<top>
=over 4
Returns the XML::LibXML object tree which needs to be compiled.
=back
=head2 Read XML
$obj-E<gt>B<dataToXML>(NODE|REF-XML|XML|FILENAME|KNOWN)
=over 4
Collect XML data. A preparsed NODE is returned unchanged. A reference
to a SCALAR is interpreted as a reference to XML as plain text (XML
texts can be large, hence you can improve performance by passing them
around by reference instead of by copy).
Any value which starts with blanks followed by a "E<lt>" is interpreted
as XML text.
You may also specify a pre-defined (KNOWN) name-space. A set of definition
files is included in the distribution, and installed somewhere when the
module was installed. Either define an environment variable SCHEMA_LOCATION
or use L<new(schema_dirs)|XML::Compile/"Constructors"> to inform the library where to find these
files.
=back
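The way the argument is recognized can be illustrated with a small
classifier. This is only a sketch: the function name
C<classify_source> and its return labels are hypothetical, the order of
the checks is assumed from the description above, and leading blanks
are assumed to be optional. The real method goes on to parse the data;
this sketch only names the kind of input.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical classifier, not part of XML::Compile.
sub classify_source {
    my ($thing) = @_;

    return 'none'     unless defined $thing;
    return 'node'     if blessed($thing) && $thing->isa('XML::LibXML::Node');
    return 'xml-ref'  if ref($thing) eq 'SCALAR';
    return 'xml-text' if !ref($thing) && $thing =~ /^\s*</;

    # Anything else is taken to be a FILENAME or a KNOWN name-space.
    return 'filename-or-known';
}

my $xml = "  <schema/>";

print classify_source($xml),     "\n";   # xml-text
print classify_source(\$xml),    "\n";   # xml-ref
print classify_source('my.xsd'), "\n";   # filename-or-known
```

Passing C<\$xml> rather than C<$xml> avoids copying a large document
each time it is handed to another routine.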
$obj-E<gt>B<parse>(STRING)
=over 4
Extract document element tree from the STRING, which represents XML.
This is a wrapper around XML::LibXML method C<parse_string()>.
=back
$obj-E<gt>B<parseFile>(FILENAME)
=over 4
Extract document element tree from a file, specified by FILENAME.
This is a wrapper around XML::LibXML method C<parse_file()>.
=back
=head2 Filters
$obj-E<gt>B<walkTree>(NODE, CODE)
=over 4
Walks the whole tree from NODE downwards, calling the CODE reference
for each NODE found. When that routine returns false, the child
nodes will be skipped.
=back
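The traversal can be sketched with XML::LibXML directly. The function
C<walk_tree> below is a hypothetical re-creation written for this
example, not the method itself, and the sample document is made up.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical re-creation of the traversal described above.
sub walk_tree {
    my ($node, $code) = @_;

    # A false return value skips the child nodes of $node.
    return unless $code->($node);

    walk_tree($_, $code) for $node->childNodes;
    return;
}

my $doc = XML::LibXML->new->parse_string('<a><b><c/></b><d/></a>');

my @seen;
walk_tree($doc->documentElement, sub {
    my $node = shift;
    push @seen, $node->nodeName;
    return $node->nodeName ne 'b';    # do not descend into <b>
});

print "@seen\n";    # prints "a b d": <c> is skipped
```

Returning false from the callback is a cheap way to prune subtrees
which are known to be irrelevant.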
=head1 DIAGNOSTICS
I<Error:> cannot find pre-installed name-space files
Use $ENV{SCHEMA_LOCATION} or L<new(schema_dirs)|XML::Compile/"Constructors"> to express location
of installed name-space files, which came with the L<XML::Compile|XML::Compile>
distribution package.
I<Error:> don't known how to interpret XML data
I<Error:> no XML data specified
=head1 SEE ALSO
This module is part of XML-Compile distribution version 0.13,
built on January 29, 2007. Website: F<http://perl.overmeer.net/xml-compile/>
=head1 LICENSE
Copyrights 2006-2007 by Mark Overmeer. For other contributors see ChangeLog.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.