NAME

XML::RSS::Parser - A liberal object-oriented parser for RSS feeds.

SYNOPSIS

#!/usr/bin/perl -w

use strict;
use XML::RSS::Parser;
use URI;
use LWP::UserAgent;
use Data::Dumper;

my $ua = LWP::UserAgent->new;
$ua->agent('XML::RSS::Parser Test Script');
my @places=( 'http://www.timaoutloud.org/xml/index.rdf' );

my $p = XML::RSS::Parser->new;

foreach my $place ( @places ) {

	# retrieve feed
	my $url=URI->new($place);
	my $req=HTTP::Request->new;
	$req->method('GET');
	$req->uri($url);
	my $feed = $p->parse($ua->request($req)->content);
	
	# output some values
	my $title = XML::RSS::Parser->ns_qualify('title',$feed->rss_namespace_uri);
	print $feed->channel->type.": ".$feed->channel->element($title)->value."\n";
	print "item count: ".$feed->item_count()."\n";
	foreach my $i ( @{ $feed->items } ) {
		foreach ( keys %{ $i->element } ) {
			print $_.": ".$i->element($_)->value."\n";
		}
		print "\n";
	}
	
	# data dump of the feed to screen.
	my $d = Data::Dumper->new([ $feed ]);
	print $d->Dump."\n\n";

}

DESCRIPTION

XML::RSS::Parser is a lightweight, liberal parser of RSS feeds derived from the XML::Parser::LP module that I developed for mt-rssfeed, a MovableType plugin. This parser is "liberal" in that it does not demand compliance with a specific RSS version and will attempt to gracefully handle tags it does not expect or understand. The parser's only requirement is that the file is well-formed XML and remotely resembles RSS. The module is leaner than XML::RSS, the majority of whose code is for generating RSS files.

Your feedback and suggestions are greatly appreciated. See the TO DO AND ISSUES section for some brief thoughts on next steps.

This module requires the XML::Parser package.

METHODS

The following methods are available:

  • new

    Constructor for XML::RSS::Parser. Returns a reference to an XML::RSS::Parser object.

  • parse(source)

    Inherited from XML::Parser. The SOURCE parameter should be either an open IO::Handle or a string containing the whole XML document. The method dies if a parse error occurs; otherwise it returns an XML::RSS::Parser::Feed object.

  • parsefile(file)

    Inherited from XML::Parser. FILE is the name of a file, which is opened for reading and passed to parse as an open handle. The file is closed no matter how parse returns. The method dies if a parse error occurs; otherwise it returns an XML::RSS::Parser::Feed object.

  • ns_qualify(element, namespace_uri)

    A simple utility method, callable as a class method, that returns a fully namespace qualified string for the supplied element.
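
Because parse and parsefile die on a parse error, a script that fetches feeds from the network will usually want to trap that exception rather than exit. The following is a minimal sketch of one way to do so. The try_parse helper and the Local::StubParser stand-in are hypothetical names introduced here so the example runs without the module installed; they are not part of XML::RSS::Parser.

```perl
use strict;
use warnings;

# Hypothetical helper: call parse() inside eval so a die becomes a return value.
# Returns ($feed, undef) on success or (undef, $error_message) on failure.
sub try_parse {
    my ($parser, $source) = @_;
    my $feed = eval { $parser->parse($source) };
    return (undef, $@ || 'unknown parse error') unless defined $feed;
    return ($feed, undef);
}

{
    # Stand-in parser used only to keep this sketch self-contained.
    # It dies on any input that does not begin with '<'.
    package Local::StubParser;

    sub new { return bless {}, shift }

    sub parse {
        my ($self, $source) = @_;
        die "not well-formed\n" unless $source =~ /^\s*</;
        return { source => $source };
    }
}

my $parser = Local::StubParser->new;    # in real code: XML::RSS::Parser->new

for my $doc ('<rss></rss>', 'plain text') {
    my ($feed, $err) = try_parse($parser, $doc);
    print defined $feed ? "parsed ok\n" : "parse failed: $err";
}
```

With the real module, the same helper would be handed an XML::RSS::Parser object and the content returned by the HTTP request, so that one bad feed does not stop the loop over the remaining URLs.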

XML::RSS::Parser::Feed

XML::RSS::Parser::Feed is a simple object that holds the results of a parsed RSS feed.

  • new

    Constructor for XML::RSS::Parser::Feed. Returns a reference to an XML::RSS::Parser::Feed object.

  • rss_namespace_uri

    A utility method for determining the namespace RSS elements are in, if any. This is important since different RSS namespaces are in use. Returns the default namespace if it is defined; otherwise it hunts for it based on a list of common namespace URIs. Returns an empty string if a namespace cannot be determined or was not defined at all in the feed.

  • channel([XML::RSS::Parser::Block])

    Gets/Sets an XML::RSS::Parser::Block object assumed to be of type channel.

  • items([XML::RSS::Parser::Block])

    Gets/Sets an ARRAY reference of XML::RSS::Parser::Block objects assumed to be of type item.

  • item_count

    Returns an integer representing the number of items in the feed object.

  • image([XML::RSS::Parser::Block])

    Gets/Sets an XML::RSS::Parser::Block object assumed to be of type image.

  • append_item(XML::RSS::Parser::Block)

    Appends an XML::RSS::Parser::Block assumed to be of type item to the feed's array of items.
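
The lookup that rss_namespace_uri is described as performing can be pictured with the sketch below. It is not the module's code: guess_rss_namespace, its arguments, and the particular list of candidate URIs are assumptions made for illustration.

```perl
use strict;
use warnings;

# Assumed list of namespace URIs that have been used for RSS elements.
# The list the module actually consults may differ.
my @KNOWN_RSS_NS = (
    'http://purl.org/rss/1.0/',                  # RSS 1.0
    'http://my.netscape.com/rdf/simple/0.9/',    # RSS 0.9
    'http://backend.userland.com/rss2',          # short-lived RSS 2.0 namespace
);

# Hypothetical helper. $default is the feed's default namespace (or undef);
# $declared is an ARRAY reference of every namespace URI declared in the feed.
sub guess_rss_namespace {
    my ($default, $declared) = @_;

    # A defined, non-empty default namespace wins outright.
    return $default if defined $default && length $default;

    # Otherwise look for the first known RSS namespace among those declared.
    my %seen = map { $_ => 1 } @{ $declared || [] };
    for my $uri (@KNOWN_RSS_NS) {
        return $uri if $seen{$uri};
    }

    return '';    # nothing recognisable: treat RSS elements as unqualified
}

print guess_rss_namespace('http://purl.org/rss/1.0/', []), "\n";
print guess_rss_namespace(
    undef,
    [ 'http://purl.org/dc/elements/1.1/', 'http://my.netscape.com/rdf/simple/0.9/' ],
), "\n";
print '[', guess_rss_namespace(undef, []), "]\n";
```

The empty-string result matters to callers: a feed with no RSS namespace still has a title element, so the value is passed to ns_qualify as-is, as the SYNOPSIS does.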

XML::RSS::Parser::Block

XML::RSS::Parser::Block is an object that holds the contents of an RSS block. Block objects can be of type channel, item, or image. Block objects maintain a stack and a mapping of objects to their namespace qualified element names.

  • new([$type, \%attributes])

    Constructor for XML::RSS::Parser::Block. Optionally can specify the type of the block via a SCALAR, in addition to any attributes via a HASH reference. Returns a reference to an XML::RSS::Parser::Block object.

  • append(XML::RSS::Parser::Element)

    Appends an XML::RSS::Parser::Element object to the block stack and element mapping.

  • attributes([\%attributes])

    Gets/Sets a reference to a HASH containing the attributes for the block.

  • element([$nsq_element_name])

    The element method is similar to the CGI->param method. If the method is called in an ARRAY context with a SCALAR representing a namespace qualified element name, it returns all of the XML::RSS::Parser::Element objects of that name. If called with a namespace qualified element name in a SCALAR context, it returns the first XML::RSS::Parser::Element object of that name. If the method is called without a parameter, it returns a HASH reference. This HASH reference is a mapping of namespace qualified element names as keys to references to ARRAYs of one or more corresponding Element objects.

  • stack

    Returns an ARRAY of XML::RSS::Parser::Element objects representing the processing stack.

  • type([$type])

    Gets/Sets the type of block via a SCALAR. Assumed to be either channel, item, or image.

  • is_type($type)

    Tests whether the object is of a certain type. Returns a boolean value.
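
The three calling styles of the element method can be illustrated with a small stand-alone class. This is a sketch of the general wantarray technique, assuming the behaviour described above; it is not the module's implementation. Local::Block, its add method, and the plain strings stored in place of Element objects are hypothetical.

```perl
use strict;
use warnings;

{
    package Local::Block;    # hypothetical stand-in, not XML::RSS::Parser::Block

    sub new { return bless { elements => {} }, shift }

    # Store a value under a name. In the real module the name would be
    # namespace qualified and the value would be an Element object.
    sub add {
        my ($self, $name, $value) = @_;
        push @{ $self->{elements}{$name} }, $value;
        return $self;
    }

    sub element {
        my ($self, $name) = @_;

        # No argument: hand back the whole name-to-ARRAY mapping.
        return $self->{elements} unless defined $name;

        my $list = $self->{elements}{$name} || [];

        # List context gets every match; scalar context gets the first.
        return wantarray ? @$list : $list->[0];
    }
}

my $block = Local::Block->new;
my $name  = 'subject';    # stands in for a namespace qualified name
$block->add($name, 'perl')->add($name, 'rss');

my @all   = $block->element($name);    # list context
my $first = $block->element($name);    # scalar context
my $map   = $block->element;           # no argument

print scalar(@all), " values, first is $first\n";
print join(', ', sort keys %$map), "\n";
```

One consequence of this design is that chaining a call, as in the SYNOPSIS where value is called directly on the result of element, evaluates element in scalar context and therefore operates on the first matching object only.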

XML::RSS::Parser::Element

XML::RSS::Parser::Element is an object that represents one tag or tagset in an RSS block.

  • new([$type, $name, $value, \%attributes])

    Constructor for XML::RSS::Parser::Element. Optionally can specify the type of the block, the namespace qualified element name, and the value via SCALARs, in addition to any attributes via a HASH reference. Returns a reference to an XML::RSS::Parser::Element object.

  • attributes([\%attributes])

    Gets/Sets a reference to a HASH containing the attributes for the element.

  • name([$nsq_element_name])

    Gets/Sets the namespace qualified element name via a SCALAR.

  • type([$type])

    Gets/Sets the type of block via a SCALAR. Assumed to be either channel, item, or image.

  • is_type($type)

    Tests whether the object is of a certain type. Returns a boolean value.

  • value([$value])

    Gets/Sets the value of the element via a SCALAR.

  • append_value($value)

    Appends the value of the passed parameter to the object's current value.
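
A method such as append_value is useful because XML::Parser may deliver the text of a single element to its character handler in several pieces. The sketch below shows the accumulation pattern in isolation. Local::Element and the hard-coded chunks are hypothetical; they stand in for the real class and for parser callbacks.

```perl
use strict;
use warnings;

{
    package Local::Element;    # hypothetical stand-in for the real Element class

    sub new {
        my ($class, $value) = @_;
        return bless { value => defined $value ? $value : '' }, $class;
    }

    # Get the value, or set it when an argument is supplied.
    sub value {
        my $self = shift;
        $self->{value} = shift if @_;
        return $self->{value};
    }

    # Concatenate onto whatever is already stored.
    sub append_value {
        my ($self, $more) = @_;
        $self->{value} .= $more;
        return $self->{value};
    }
}

my $title = Local::Element->new;

# Chunks as a parser might deliver them for one run of character data.
$title->append_value($_) for ('Tim', ' Out', ' Loud');

print $title->value, "\n";
```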

DEPENDENCIES

XML::Parser

SEE ALSO

XML::Parser, http://feeds.archive.org/validator/, http://www.xml.com/pub/a/2002/12/18/dive-into-xml.html, http://www.oreillynet.com/pub/a/webservices/2002/11/19/rssfeedquality.html

TO DO AND ISSUES

  • XHTML and FOAF content handling.

  • Abstraction layer for handling overlapping elements found throughout the various RSS formats.

LICENSE

The software is released under the Artistic License. The terms of the Artistic License are described at http://www.perl.com/language/misc/Artistic.html.

AUTHOR & COPYRIGHT

Except where otherwise noted, XML::RSS::Parser is Copyright 2003, Timothy Appnel, self@timaoutloud.org. All rights reserved.