NAME
App::Rssfilter::Match::Duplicates - match an RSS item which has been seen before
VERSION
version 0.07
SYNOPSIS
use App::Rssfilter::Match::Duplicates;
use Mojo::DOM;
my $first_rss = Mojo::DOM->new( <<"End_of_RSS" );
<?xml version="1.0" encoding="UTF-8"?>
<rss>
<channel>
<item>
<link>http://rss.slashdot.org/~r/Slashdot/slashdot/~6/gu7UEWn8onK/is-typing-tiring-your-toes</link>
<description>type with toes for tighter tarsals</description>
</item>
<item>
<link>http://rss.slashdot.org/~r/Slashdot/slashdot/~9/lloek9InU2p/new-planet-discovered-on-far-side-of-sun</link>
<description>vulcan is here</description>
</item>
</channel>
</rss>
End_of_RSS
my $second_rss = Mojo::DOM->new( <<"End_of_RSS" );
<?xml version="1.0" encoding="UTF-8"?>
<rss>
<channel>
<item>
<link>http://rss.slashdot.org/~r/Slashdot/slashdot/~3/mnej39gJa9E/new-rocket-to-visit-mars-in-60-days</link>
<description>setting a new speed record</description>
</item>
<item>
<link>http://rss.slashdot.org/~r/Slashdot/slashdot/~9/lloek9InU2p/new-planet-discovered-on-far-side-of-sun</link>
<description>vulcan is here</description>
</item>
</channel>
</rss>
End_of_RSS
print "$_\n" for $first_rss->find( 'item' )->grep( \&App::Rssfilter::Match::Duplicates::match )->each;
print "$_\n" for $second_rss->find( 'item' )->grep( \&App::Rssfilter::Match::Duplicates::match )->each;
# or with an App::Rssfilter::Rule
use App::Rssfilter::Rule;
my $dupe_rule = App::Rssfilter::Rule->new(
condition => 'Duplicates',
action => sub { print shift->to_xml, "\n" },
);
$dupe_rule->constrain( $first_rss );
$dupe_rule->constrain( $second_rss );
# either way, prints
# <item>
# <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~9/lloek9InU2p/new-planet-discovered-on-far-side-of-sun</link>
# <description>vulcan is here</description>
# </item>
DESCRIPTION
This module matches an RSS item if either its GUID or its link has been seen previously.
FUNCTIONS
match
my $item_seen_before = App::Rssfilter::Match::Duplicates::match( $item );
Returns true if $item has a GUID or link which matches a previously-seen GUID or link. Query strings in links and GUIDs are ignored when comparing against previously-seen values.
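The matching rule described above can be illustrated with a minimal, standalone sketch. This is not the module's actual implementation: the names strip_query and make_duplicate_matcher are hypothetical, the identifiers are passed in as plain strings rather than taken from a Mojo::DOM item, and the example assumes that every identifier is remembered whether or not it matched.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the query string (everything from the first '?').
sub strip_query {
    my ($id) = @_;
    $id =~ s/\?.*\z//s;
    return $id;
}

# Hypothetical factory returning a matcher with its own private "seen" store.
# The real module manages its own state; this closure is only an illustration.
sub make_duplicate_matcher {
    my %seen;
    return sub {
        my (@ids) = @_;    # the GUID and/or link of a single item
        my @keys = map { strip_query($_) }
                   grep { defined($_) && length($_) } @ids;
        my $dupe = grep { $seen{$_} } @keys;    # count of already-seen ids
        $seen{$_} = 1 for @keys;                # remember them for next time
        return $dupe ? 1 : 0;
    };
}

my $seen_before = make_duplicate_matcher();
my @results = map { $seen_before->( @{$_} ) } (
    [ 'http://example.com/story-one?utm_source=rss' ],
    [ 'http://example.com/story-two' ],
    [ 'http://example.com/story-one?utm_source=mail' ],
);
print join( ',', @results ), "\n";    # prints 0,0,1
```

The third identifier differs from the first only in its query string, so it is reported as a duplicate once the query string has been stripped.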
SEE ALSO
AUTHOR
Daniel Holz <dgholz@gmail.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2013 by Daniel Holz.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.