Yahoo! Pipes

Tags: yahoo-pipes, rss, scraping

Published on Sunday, May 11th, 2014

This post is migrated from my old blog.

I don’t know whether RSS and Atom are still popular, but I personally use them a lot. Here are some examples of the feeds that I subscribe to.

The xkcd feed works very well. However, I often found that some entries in the Wikipedia feed (Module namespace) are not really code updates. Rather, they are documentation updates or experiments in sandboxes, indicated by titles ending with /doc or /sandbox. Since I subscribed to the feed for the entire Module namespace, these unwanted entries inevitably showed up.

Yesterday, I thought that it would be good to have a program that filters the feeds I subscribe to. I could not find a Mac program that did what I wanted. However, I finally found Yahoo! Pipes, a web service that solves this problem!

Here is a tutorial for Yahoo! Pipes.

Basically, what I did was retrieve the feed and then use a regex filter to block all entries whose titles end with /doc or /sandbox. Finally, I exported the result as another RSS feed from Yahoo! Pipes and used this feed in my feed reader instead of the old one.
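
For readers who want the same filtering without Yahoo! Pipes, here is a rough Python sketch of the idea. It is not what Pipes ran internally, just my approximation using the standard library. The function name filter_rss and the sample feed are made up for illustration, and it assumes the feed is plain RSS 2.0 (an Atom feed uses XML namespaces and would need different element names).

```python
import re
import xml.etree.ElementTree as ET

# Titles ending in /doc or /sandbox are the entries to drop.
UNWANTED_TITLE = re.compile(r"/(doc|sandbox)$")


def filter_rss(rss_text: str) -> str:
    """Return the RSS 2.0 document with unwanted <item> elements removed.

    Hypothetical helper for illustration; assumes un-namespaced RSS 2.0.
    """
    root = ET.fromstring(rss_text)
    for channel in root.iter("channel"):
        # Copy the list first, because items are removed while looping.
        for item in list(channel.findall("item")):
            title = (item.findtext("title") or "").strip()
            if UNWANTED_TITLE.search(title):
                channel.remove(item)
    return ET.tostring(root, encoding="unicode")


# A tiny made-up feed standing in for the real Module namespace feed.
SAMPLE = """<rss version="2.0"><channel><title>Module changes</title>
<item><title>Module:Example</title></item>
<item><title>Module:Example/doc</title></item>
<item><title>Module:Example/sandbox</title></item>
</channel></rss>"""

if __name__ == "__main__":
    print(filter_rss(SAMPLE))
```

To use it with a real feed, you would download the feed text first (for example with urllib.request) and serve or save the filtered output somewhere your feed reader can reach.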

Oh, and, Yahoo! Pipes is the only reason why I am using Yahoo! Hopefully the company will not shut this service down in the same way Google usually does, because, if so, ...

Bye-bye Yahoo!

Update on December 26, 2016: Yahoo! announced on June 4, 2015 that it would shut down Yahoo! Pipes, and the service went offline on September 30, 2015. What a shame!