NAME

App::Rssfilter::Match::Duplicates - match an RSS item which has been seen before

VERSION

version 0.07

SYNOPSIS

  use App::Rssfilter::Match::Duplicates;

  use Mojo::DOM;
  my $first_rss = Mojo::DOM->new( <<"End_of_RSS" );
<?xml version="1.0" encoding="UTF-8"?>
<rss>
<channel>
  <item>
    <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~6/gu7UEWn8onK/is-typing-tiring-your-toes</link>
    <description>type with toes for tighter tarsals</description>
  </item>
  <item>
    <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~9/lloek9InU2p/new-planet-discovered-on-far-side-of-sun</link>
    <description>vulcan is here</description>
  </item>
</channel>
</rss>
End_of_RSS

  my $second_rss = Mojo::DOM->new( <<"End_of_RSS" );
<?xml version="1.0" encoding="UTF-8"?>
<rss>
<channel>
  <item>
    <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~3/mnej39gJa9E/new-rocket-to-visit-mars-in-60-days</link>
    <description>setting a new speed record</description>
  </item>
  <item>
    <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~9/lloek9InU2p/new-planet-discovered-on-far-side-of-sun</link>
    <description>vulcan is here</description>
  </item>
</channel>
</rss>
End_of_RSS

  print "$_\n" for $first_rss->find( 'item' )->grep( \&App::Rssfilter::Match::Duplicates::match );
  print "$_\n" for $second_rss->find( 'item' )->grep( \&App::Rssfilter::Match::Duplicates::match );

  # or with an App::Rssfilter::Rule

  use App::Rssfilter::Rule;
  my $dupe_rule = App::Rssfilter::Rule->new(
      condition => 'Duplicates',
      action    => sub { print shift->to_xml, "\n" },
  );
  $dupe_rule->constrain( $first_rss );
  $dupe_rule->constrain( $second_rss );

  # either way, prints

  # <item>
  #   <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~9/lloek9InU2p/new-planet-discovered-on-far-side-of-sun</link>
  #   <description>vulcan is here</description>
  # </item>

DESCRIPTION

This module matches an RSS item if its GUID or its link has been seen previously.

FUNCTIONS

match

  my $item_seen_before = App::Rssfilter::Match::Duplicates::match( $item );

Returns true if $item has a GUID or link which matches a previously-seen GUID or link. Query strings in links and GUIDs are ignored when comparing against previously-seen values.
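
The matching rule described above can be illustrated with a minimal sketch in core Perl. This is not the module's actual implementation: the names strip_query, seen_before and %seen are hypothetical, and the sketch assumes that "ignoring the query string" means discarding everything from the first "?" onwards.

```perl
use strict;
use warnings;

# Hypothetical store of identifiers (GUIDs or links) seen so far,
# keyed by the identifier with its query string removed.
my %seen;

# Assumption: the query string is everything from the first '?' onwards.
sub strip_query {
    my ( $url ) = @_;
    $url =~ s/ [?] .* \z //xms;
    return $url;
}

# Returns 1 if any of the given identifiers was seen before, otherwise 0.
# All identifiers are recorded, so a later call with the same GUID or link
# (with any query string) will match.
sub seen_before {
    my ( @identifiers ) = @_;
    my @keys = map  { strip_query( $_ ) }
               grep { defined( $_ ) && length( $_ ) } @identifiers;
    my $is_dupe = grep { $seen{ $_ } } @keys;
    $seen{ $_ } = 1 for @keys;
    return $is_dupe ? 1 : 0;
}

print seen_before( 'http://example.com/a?utm=1' ), "\n";   # prints 0
print seen_before( 'http://example.com/a?utm=2' ), "\n";   # prints 1
```

The second call matches because both links reduce to the same key once the query string is dropped. In the same way, the "vulcan is here" item in the SYNOPSIS is only matched when it turns up again in the second feed.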

SEE ALSO

AUTHOR

Daniel Holz <dgholz@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by Daniel Holz.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.