NAME
App::Wubot::Guide::FeedFu - example fu for handling feeds
DESCRIPTION
This document gives some examples of the range of operations that can be performed on a feed.
FILTERING
I follow twitter, but I am not a fan of the tweets of the form 'I'm at some location'. So I filter those out with the following rule:
- name: ignore
condition: subject matches ^I.m at
last_rule: 1
I follow the lifehacker RSS feed, but I know in advance that I'm not going to be interested in any articles that refer to windows 8 or sports.
- name: ignore
condition: subject imatches windows 8|sports
last_rule: 1
A more complex example can be found in App::Wubot::Guide::GettingStarted.
COMBINING FEEDS
I follow a lot of RSS feeds. I often like to combine multiple feeds together into a single feed. For example, I follow the 'tekgear' and 'thinkgeek' RSS feeds, and I combine them together into a 'shop' feed.
In the configuration for both RSS feeds, I simply set the 'mailbox' to 'shop'.
- name: categorize
plugin: SetField
config:
set:
mailbox: shop
Then I point my RSS feeder to the combined feed.
http://myhost:3000/atom/shop.xml
SPLITTING AN INCOMING FEED INTO MULTIPLE OUTGOING FEEDS
I monitor my email inbox with the 'Mbox' plugin. I also get e-mails from Jira in my inbox, but I like to route those off to a 'jira' mailbox rather than to my mbox inbox. The following reactor rule would find emails that have JIRA in the subject and set the target mailbox to 'jira'.
- name: jira
condition: subject matches JIRA
plugin: SetField
config:
field: mailbox
value: jira
STRIPPING IMAGES
I hate that some RSS feeds contain advertisments or other images that are not related to the content. You can strip them out by using the ImageStrip reactor.
For example, the RSS feed from slashdot has image buttons to share an article on facebook or twitter. If I really want to do that, I'll click off to the article.
---
url: http://rss.slashdot.org/Slashdot/slashdot
delay: 1h
react:
- name: body image remover
condition: contains body
plugin: ImageStrip
config:
field: body
FETCHING THE BODY
I hate that a lot of RSS feeds have started only providing 100 or less characters of the article. This requires you to click off to the website to get the content. Using wubot, you can fetch the body and trim out the bits that are not interesting.
One example is the efoodalert RSS feed which provides information about food recalls.
---
url: http://efoodalert.wordpress.com/feed/
delay: 1h
react:
- name: get full body
condition: contains body
rules:
- name: fetch body
plugin: WebFetch
config:
field: body
url_field: link
- name: capture body contents
plugin: CaptureData
config:
field: body
regexp: '^.*(<div id="content">.*)<div class="postinfo'
FEED CONVERSIONS
Wubot can be used to transform a feed from one feed type to another. For example, data coming in from any of the following forms can be routed back out to any of the other forms:
- RSS/Atom
- mbox/maildir
- IRC
- logfiles
For example, if you wanted to receive an RSS feed and write the articles out to an mbox, you could configure a monitor for the RSS feed:
---
url: http://rss.slashdot.org/Slashdot/slashdot
delay: 1h
react:
- name: categorize
plugin: SetField
config:
set:
mailbox: news
Then a reactor rule could be used to write it back out in Maildir format:
rules:
- name: notify maildir
plugin: Maildir
condition: mailbox equals news
config:
path: /usr/home/wu/mail
mailbox: news
Now you can read your RSS feeds in mutt.