NAME
XML::RSS::Tools - Perl extension for very high level RSS Feed manipulation
SYNOPSIS
use XML::RSS::Tools;
my $rss_feed = XML::RSS::Tools->new;
$rss_feed->rss_uri('http://foo/bar.rdf');
$rss_feed->xsl_file('/my/rss_transformation.xsl');
$rss_feed->transform;
print $rss_feed->as_string;
DESCRIPTION
RSS/RDF feeds are a common way of distributing the latest news about a given web site for news syndication. This module provides a VERY high level way of manipulating them. You could easily use LWP, XML::RSS and XML::LibXSLT to do this yourself.
When working with XML, if the file is invalid for some reason this module will croak, bringing your application down. If you wish your program to fail gracefully, enclose method calls that deal with XML manipulation in an eval statement.
Otherwise method calls will return true on success, and false on failure. For example after loading a URI via HTTP, you may wish to check the error status before proceeding with your code:
unless ($rss_feed->rss_uri('http://this.goes.nowhere/')) {
print "Unable to obtain file via HTTP: ", $rss_feed->as_string('error');
# Do what else
# you have to.
} else {
# carry on...
}
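The eval advice above can be sketched as follows. This is a minimal example, not a definitive pattern: the malformed $bad_xml string and the variable names are illustrative assumptions, and the exact error text depends on the underlying parser.

```perl
use strict;
use warnings;
use XML::RSS::Tools;

my $rss_feed = XML::RSS::Tools->new;

# Deliberately malformed XML (illustrative only) to provoke a parser failure.
my $bad_xml = '<rss><channel><title>Broken</rss>';

# eval traps a fatal parser error. The trailing "1" means $survived is
# true only if the block ran to completion without dying.
my $loaded;
my $survived = eval { $loaded = $rss_feed->rss_string($bad_xml); 1 };

if (!$survived) {
    warn "RSS parser died: ", ($@ || "unknown error\n");
}
elsif (!$loaded) {
    warn "RSS not loaded: ", ($rss_feed->as_string('error') || 'no message'), "\n";
}
else {
    print "Feed loaded\n";
}
```

Keeping the method's return value separate from the eval result lets the caller tell a fatal parser error apart from an ordinary false return.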
CONSTRUCTOR
my $rss_object = XML::RSS::Tools->new;
Or with optional parameters.
my $rss_object = XML::RSS::Tools->new(
version => 0.91,
auto_wash => 1,
debug => 1);
METHODS
Source RSS feed
$rss_object->rss_file('/my/file.rss');
or
$rss_object->rss_uri('http://my.server.com/index.rss');
or
$rss_object->rss_string($xml_file);
All return true on success, false on failure. If an XML file was provided but was invalid XML the parser will fail fatally at this time. The input RSS feed will automatically be normalised to the preferred RSS version at this time, so choose your version before you load the feed.
Source XSL-Template
$rss_object->xsl_file('/my/file.xsl');
or
$rss_object->xsl_uri('http://my.server.com/index.xsl');
or
$rss_object->xsl_string($xml_file);
All return true on success, false on failure. The XSL-T file is not parsed or verified at this time.
Other Methods
$rss_object->as_string;
Returns the RSS file after it has been through the XSL-T process. Optionally you can pass this method one additional parameter to obtain the source RSS, the XSL template or any error message:
$rss_object->as_string('xsl');
$rss_object->as_string('rss');
$rss_object->as_string('error');
If there is nothing to stringify you will get nothing.
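Putting the source, template and output methods together, a complete run on in-memory strings might look like the sketch below. The feed and stylesheet are made-up examples, and the whole pipeline is wrapped in eval because parsing may fail fatally. Whether it completes without network access depends on the external-reference behaviour described under Defects and Limitations.

```perl
use strict;
use warnings;
use XML::RSS::Tools;

# A tiny made-up RSS 0.91 feed (illustrative content only).
my $rss = <<'RSS';
<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example channel</title>
    <link>http://example.com/</link>
    <description>An example feed</description>
    <language>en</language>
    <item>
      <title>Example item</title>
      <link>http://example.com/item</link>
    </item>
  </channel>
</rss>
RSS

# A made-up stylesheet that prints each item title on its own line.
my $xsl = <<'XSL';
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="rss/channel/item">
      <xsl:value-of select="title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL

# Choose the target version in the constructor, before loading the feed.
my $rss_object = XML::RSS::Tools->new(version => 0.91);

my $output;
my $survived = eval {
    $rss_object->rss_string($rss) or die "could not load RSS\n";
    $rss_object->xsl_string($xsl) or die "could not load XSL\n";
    $rss_object->transform        or die "transformation failed\n";
    $output = $rss_object->as_string;
    1;
};

if ($survived) {
    print $output;
}
else {
    warn "Pipeline failed: ", ($@ || "unknown error\n");
}
```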
$rss_object->debug(1);
A simple switch that controls the debug status of the module. By default debug is off. Returns the current status.
$rss_object->get_auto_wash;
$rss_object->get_version;
and
$rss_object->set_auto_wash(1);
$rss_object->set_version(0.92);
These methods control the core RSS functionality. The get methods return the current setting, and the set methods set the value. By default the RSS version is set to 0.91 and auto_wash to true. All incoming RSS feeds are automatically converted to one RSS version. If auto_wash is true, all RSS files are cleaned before RSS normalisation: known entities are replaced by their numeric values, and known invalid XML constructs are fixed.
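To illustrate what replacing known entities by their numeric value means, here is a small stand-alone sketch. It is not the module's actual washing routine: the %entity_code table and the numeric_ref and wash_entities functions are hypothetical names made up for this example, and the table lists only a handful of entities.

```perl
use strict;
use warnings;

# Hypothetical lookup table: a few named HTML entities and their code points.
my %entity_code = (
    nbsp   => 160,
    pound  => 163,
    copy   => 169,
    eacute => 233,
);

# Return the numeric character reference for a known entity name,
# or the original named entity unchanged if it is not in the table.
sub numeric_ref {
    my ($name) = @_;
    return exists $entity_code{$name}
        ? sprintf('&#%d;', $entity_code{$name})
        : "&$name;";
}

# Replace every known named entity in a string. The five predefined
# XML entities (amp, lt, gt, quot, apos) are not in the table, so they
# pass through untouched.
sub wash_entities {
    my ($xml) = @_;
    $xml =~ s/&(\w+);/numeric_ref($1)/ge;
    return $xml;
}

print wash_entities('<title>Caf&eacute; &amp; bar &copy; 2002</title>'), "\n";
# prints: <title>Caf&#233; &amp; bar &#169; 2002</title>
```

Numeric references need no DTD to resolve, which is why this kind of substitution helps a strict XML parser accept feeds that use HTML-style named entities.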
PREREQUISITES
To function you must have at least XML::RSS installed, and to be of any real use XML::LibXSLT and XML::LibXML.
Either HTTP::GHTTP or LWP will bring this module to full functionality. HTTP::GHTTP is much faster than LWP but not as widely available.
Any OS able to run the core requirements.
EXPORT
None by default.
HISTORY
0.06 Changes to HTML Documentation. Tests fixed.
0.05 More minor stuff. Change to entities routine - still not ideal. Test suite upgraded and expanded again.
0.04 Removed un-used test files, other minor changes. Defect in Test script corrected, tested module on Linux.
0.03 Minor code changes and defect corrections. Example script included.
0.02 Some code changes, POD expanded, and test suite more developed.
0.01 Initial Build. Shown to the public on PerlMonks May 2002, for feedback.
See the Changes file for more detail.
ToDo
This module needs expanded testing, and beta testing in the wild. It also needs the ability to accept rss/xsl files directly from file handles.
The URI handler needs to redirect "file:" requests to the file processor rather than the HTTP tool. In theory I could remove the xxx_file methods altogether if we treat all files as URIs.
Provide an xmlcatalog example so that the manual removal of DTDs can be taken out.
Possibly re-write XML::RSS based on the LibXML parser, or fix its output on the older XML::Parser core. Try to develop a more effective XML stream pre-parsing auto-wash function.
Defects and Limitations
If an RSS or XSL-T file is passed into LibXML and it contains references to external files, such as a DTD or external entities, LibXML will automatically attempt to obtain the files before performing the transformation. If the files referred to are on the public Internet, and you do not have a connection when this happens, you may find that the process waits around for several minutes until LibXML gives up. If you plan to use this module in an asynchronous manner, you should set up an XML Catalog for LibXML using the GNOME xmlcatalog command. See http://www.xmlsoft.org/catalog.html for more details.
Many commercial RSS feeds are derived from the Content Management System in use at the site. Often the RSS feed is either not well formed or it is invalid. In either case this will prevent the RSS parser from functioning, and you will get no output. The auto_wash option attempts to fix these errors, but it is neither perfect nor ideal. Some people report good success with complaining to the site.
XML::RSS, which this module uses for RSS normalisation, has a defect in that it does not escape & " ' < > in its output stream, resulting in invalid XML. Again the auto_wash option attempts to correct this, but again the correction is not reliable.
Perl before 5.7.x is not able to handle Unicode fully, and strange things happen. Things should get better now that 5.8.0 is available.
AUTHOR
Adam Trickett, <atrickett@cpan.org>
This module contains the direct and indirect input of a number of Perlmonks: Ovid, Matts and more...
SEE ALSO
perl, XML::RSS, XML::LibXSLT, XML::LibXML, LWP and HTTP::GHTTP.
COPYRIGHT
XML::RSS::Tools, Copyright iredale Consulting 2002
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111, USA.