Tag: scraping
Bulk Scraping
Published on Friday, January 6th, 2017
One of my friends, who is not a CS concentrator, wanted to scrape the email addresses of all the faculty members listed on this website. Unfortunately, the emails are not on the page itself but on subpages. It would take forever to collect the data by hand, so I helped. To do this, I needed to send one request per subpage. Naively, we would send a request, wait for the response, and repeat until we had gone through the whole list of faculty members. This, however, would take a lot of time, because most of it is spent waiting on the network. We can do better by sending the requests asynchronously. This is feasible because the requests do not depend on one another.
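As a rough illustration of the asynchronous approach, here is a minimal Python sketch using `asyncio`. It is not the code from the original post. The names `fake_fetch`, `scrape_emails`, `EMAIL_RE`, and `max_concurrency` are hypothetical, the `example.edu` URLs are placeholders, and the network call is simulated with a short sleep so that the sketch runs without touching a real site.

```python
import asyncio
import re

# Simple pattern for illustration; real-world email matching is messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


async def fake_fetch(url):
    """Hypothetical stand-in for a real HTTP request (assumption).

    It sleeps briefly to mimic network latency, then returns a tiny
    HTML snippet containing an address derived from the URL.
    """
    await asyncio.sleep(0.1)
    name = url.rsplit("/", 1)[-1]
    return f"<p>Contact: {name}@example.edu</p>"


async def scrape_emails(urls, fetch=fake_fetch, max_concurrency=10):
    """Fetch every subpage concurrently and map each URL to its emails."""
    # Cap the number of in-flight requests so the server is not flooded.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def scrape_one(url):
        async with semaphore:
            html = await fetch(url)
        return url, EMAIL_RE.findall(html)

    # The subpages are independent, so all requests can be in flight at once.
    pairs = await asyncio.gather(*(scrape_one(u) for u in urls))
    return dict(pairs)


if __name__ == "__main__":
    names = ("alice", "bob", "carol")
    urls = [f"https://example.edu/faculty/{n}" for n in names]
    print(asyncio.run(scrape_emails(urls)))
```

To use this against a real site, `fetch` would be replaced by a coroutine that performs the HTTP request with an asynchronous HTTP client. The semaphore is a courtesy limit: it keeps the speed-up from concurrency while bounding how many requests hit the server at the same time.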
Huginn
Published on Monday, December 26th, 2016
Yahoo! announced the shutdown of Yahoo! Pipes on June 4, 2015. It breaks my heart to see another good service die. However, I recently found another project that offers similar capabilities: Huginn.
Yahoo! Pipes
Published on Sunday, May 11th, 2014
This post was migrated from my old blog.
I don’t know whether RSS and Atom are still popular, but I personally use them a lot. Here are some examples of the feeds that I subscribe to.