NAME

SOAP::WSDL::Parser - How SOAP::WSDL parses XML messages

Which XML messages does SOAP::WSDL parse?

Naturally, there are two kinds of XML documents (or messages) SOAP::WSDL has to parse:

  • WSDL definitions

  • SOAP messages

Parser implementations

Several parser implementations are available for SOAP messages. For WSDL definitions there is currently only one.

WSDL definitions parser

  • SOAP::WSDL::SAX::WSDLHandler

    This is a SAX handler for parsing WSDL files into object trees SOAP::WSDL works with.

    It's built as a native handler for XML::LibXML, but will also work with XML::SAX::ParserFactory.

    To parse a WSDL file, use one of the following variants:

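    # Variant 1: use XML::LibXML directly
    use XML::LibXML;
    use SOAP::WSDL::SAX::WSDLHandler;
    my $parser = XML::LibXML->new();
    my $handler = SOAP::WSDL::SAX::WSDLHandler->new();
    $parser->set_handler( $handler );
    $parser->parse( $xml );
    my $data = $handler->get_data();

    # Variant 2: use a parser provided by XML::SAX::ParserFactory
    use XML::SAX::ParserFactory;
    use SOAP::WSDL::SAX::WSDLHandler;
    my $handler = SOAP::WSDL::SAX::WSDLHandler->new({
        base => 'XML::SAX::Base'
    });
    my $parser = XML::SAX::ParserFactory->parser(
        Handler => $handler
    );
    $parser->parse( $xml );
    my $data = $handler->get_data();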

SOAP messages parser

All SOAP message handlers use class resolvers to find out which class a particular XML element should belong to, and type libs containing these classes.

Creating a class resolver

The easiest way to create a class resolver is to run SOAP::WSDL's generator.

See wsdl2perl.pl.

The class resolver must implement a class method "get_class", which is passed a list ref of the current element's XPath (relative to Body), split by /.

This method must return a class name appropriate for an XML element.

A class resolver package might look like this:

package ClassResolver;
use strict;
use warnings;

# Maps element paths (relative to Body) to type lib class names.
my %class_list = (
   'EnqueueMessage' => 'Typelib::TEnqueueMessage',
   'EnqueueMessage/MMessage' => 'Typelib::TMessage',
   'EnqueueMessage/MMessage/MRecipientURI' => 'SOAP::WSDL::XSD::Builtin::anyURI',
   'EnqueueMessage/MMessage/MMessageContent' => 'SOAP::WSDL::XSD::Builtin::string',
);

sub new { return bless {}, 'ClassResolver' }

# Called as a class method; $_[1] is a list ref of path steps.
sub get_class {
   my $name = join('/', @{ $_[1] });
   return $class_list{ $name } if exists $class_list{ $name };
   # Warn and return nothing, rather than returning warn's true value.
   warn "no class found for $name";
   return;
}
1;
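
To illustrate how a parser might call such a resolver, the following minimal sketch keeps a stack of element names while walking a message and asks the resolver for a class at each element start. The package name Demo::Resolver, the hard-coded event list and the reduced mapping are assumptions made for illustration. They are not part of SOAP::WSDL, and the real parsers may track the path differently.

```perl
use strict;
use warnings;

# Hypothetical resolver, reduced to two entries for illustration.
package Demo::Resolver;

my %class_list = (
    'EnqueueMessage'          => 'Typelib::TEnqueueMessage',
    'EnqueueMessage/MMessage' => 'Typelib::TMessage',
);

# Class method: receives a list ref of path steps relative to Body.
sub get_class {
    my ($class, $path_ref) = @_;
    return $class_list{ join '/', @{ $path_ref } };
}

package main;

# Simulated parser events: [ start => name ] or [ end => name ].
my @events = (
    [ start => 'EnqueueMessage' ],
    [ start => 'MMessage' ],
    [ end   => 'MMessage' ],
    [ end   => 'EnqueueMessage' ],
);

my @path;       # current position below Body
my @resolved;   # classes found, in document order

for my $event (@events) {
    my ($type, $name) = @{ $event };
    if ($type eq 'start') {
        # Entering an element: extend the path, then resolve its class.
        push @path, $name;
        push @resolved, Demo::Resolver->get_class(\@path);
    }
    else {
        # Leaving an element: shorten the path again.
        pop @path;
    }
}

print join(', ', @resolved), "\n";
```

Passing the path as a list ref leaves the choice of lookup key to the resolver; joining by / as shown is only one option.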

Creating type lib classes

Every element must have a corresponding class in the type library.

Builtin types should be resolved as SOAP::WSDL::XSD::Builtin::* classes.

Creating a type lib is easy: just run SOAP::WSDL's generator, which creates both a typemap and the type lib classes for a WSDL file.

Sometimes it is necessary to create type lib classes by hand, because not all WSDL definitions are complete.

For writing your own lib classes, see SOAP::WSDL::XSD::Typelib::Element, SOAP::WSDL::XSD::Typelib::ComplexType and SOAP::WSDL::XSD::Typelib::SimpleType.

Parser implementations

  • SOAP::WSDL::SAX::MessageHandler

    This is a SAX handler for parsing SOAP messages into object trees SOAP::WSDL works with.

    It's built as a native handler for XML::LibXML, but will also work with XML::SAX::ParserFactory.

    Can be used for parsing both streams (chunks) and documents.

    See SOAP::WSDL::SAX::MessageHandler for details.

  • SOAP::WSDL::Expat::MessageParser

    An XML::Parser::Expat based parser. This is the fastest parser for most SOAP messages and the default for SOAP::WSDL::Client.

  • SOAP::WSDL::Expat::MessageStreamParser

    An XML::Parser::ExpatNB based parser. Useful for parsing huge HTTP responses, as you don't need to keep everything in memory.

    See SOAP::WSDL::Expat::MessageStreamParser for details.
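
The chunk-wise parsing that a stream parser relies on can be sketched with XML::Parser's non-blocking interface directly. This is a minimal illustration, not the code of SOAP::WSDL::Expat::MessageStreamParser: the sample document, the 16-byte chunk size and the handler that merely records element names are assumptions.

```perl
use strict;
use warnings;
use XML::Parser;

# A small stand-in for a SOAP response; a real one would arrive over HTTP.
my $xml = '<Envelope><Body><EnqueueMessage><MMessage>hi</MMessage>'
        . '</EnqueueMessage></Body></Envelope>';

my @seen;   # element names in document order

# parse_start() returns an XML::Parser::ExpatNB object.
my $nb = XML::Parser->new(
    Handlers => {
        Start => sub { my ($expat, $name) = @_; push @seen, $name; },
    },
)->parse_start();

# Feed the document in 16-byte chunks, as if reading from a socket.
my $offset = 0;
while ($offset < length $xml) {
    $nb->parse_more( substr $xml, $offset, 16 );
    $offset += 16;
}

# Tell the parser that no more data will follow.
$nb->parse_done();

print join('/', @seen), "\n";
```

Chunk boundaries may fall in the middle of a tag. The parser buffers incomplete input itself, so the caller only has to hold one chunk at a time.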

Performance

SOAP::WSDL::Expat::MessageParser is the fastest way of parsing SOAP messages into object trees and only slightly slower than converting them into hash data structures:

Parsing a SOAP message with a length of 5962 bytes:

SOAP::WSDL::Expat::MessageParser:
  3 wallclock secs ( 3.28 usr +  0.05 sys =  3.33 CPU) @ 60.08/s (n=200)

SOAP::WSDL::SAX::MessageHandler (with raw XML::LibXML):
  5 wallclock secs ( 4.95 usr +  0.00 sys =  4.95 CPU) @ 40.38/s (n=200)

XML::Simple (XML::Parser):
  3 wallclock secs ( 2.36 usr +  0.03 sys =  2.39 CPU) @ 83.65/s (n=200)

XML::Simple (XML::SAX::Expat):
  7 wallclock secs ( 6.50 usr +  0.03 sys =  6.53 CPU) @ 30.62/s (n=200)

As the benchmark shows, both SOAP::WSDL parsers are faster than XML::Simple with XML::SAX::Expat (30.62/s). SOAP::WSDL::Expat::MessageParser (60.08/s) comes closest to XML::Simple with XML::Parser as backend (83.65/s), reaching about 72% of its throughput.
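
Timing lines in this format are what Perl's core Benchmark module prints. The sketch below shows how such a comparison might be set up. The two workloads are hypothetical stand-ins (simple string scans) rather than the parsers measured above, and the message is synthetic. To reproduce the figures, each sub body would have to call a real parser on a real SOAP message.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Synthetic input; an assumption standing in for a SOAP message.
my $message = '<a>' . ('<b>x</b>' x 100) . '</a>';

# Run each stand-in workload 200 times and print one timing line per entry.
my $results = timethese(200, {
    'regex scan' => sub {
        # Count occurrences of <b> with a global match in list context.
        my $n = () = $message =~ m{<b>}g;
        return $n;
    },
    'index scan' => sub {
        # Count occurrences of <b> by repeated substring search.
        my ($n, $pos) = (0, 0);
        while (($pos = index($message, '<b>', $pos)) >= 0) {
            $n++;
            $pos += 3;
        }
        return $n;
    },
});
```

The rate at the end of each line is the iteration count divided by the CPU time, so 200 iterations in 3.33 CPU seconds give roughly 60 iterations per second. With workloads as small as these stand-ins, Benchmark may warn that there were too few iterations for a reliable count.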

Parsing SOAP responses in chunks does not increase speed - at least not up to a response size of around 500k:

Benchmark: timing 5 iterations of SOAP::WSDL::SAX::MessageHandler,
  SOAP::WSDL::Expat::MessageParser, SOAP::WSDL::Expat::MessageStreamParser...

SOAP::WSDL::Expat::MessageStreamParser:
  13 wallclock secs ( 7.39 usr +  0.09 sys =  7.48 CPU) @  0.67/s (n=5)

SOAP::WSDL::Expat::MessageParser:
  10 wallclock secs ( 5.81 usr +  0.06 sys =  5.88 CPU) @  0.85/s (n=5)

SOAP::WSDL::SAX::MessageHandler:
  14 wallclock secs ( 8.78 usr +  0.03 sys =  8.81 CPU) @  0.57/s (n=5)

Response size: 344330 bytes