Scraping Tools – A quick round-up

This post was created automatically via an RSS feed and was originally published at http://10ml.com/2014/04/scraping-tools-a-quick-round-up/

I’ve written here before about using ScraperWiki to scrape content from websites which haven’t published their data as open data.

I have even used ScraperWiki to scrape our own website, to get badly-formed content out in a structured way for a hack.

Now, sadly, ScraperWiki is no longer free, and old scrapers are mostly frozen. So if you are about to embark on a hackathon you might be looking for an alternative.

I’ve noticed a few recently, though I have yet to test them.

  • Morph: a web-based tool from OpenAustralia. Write scrapers in Ruby, Python or PHP.
  • Portia: released yesterday(!) and available via GitHub, soon to be made available as a hosted service.
  • Import.IO: a free web-hosted service that promises further (maybe paid-for) features. It looks to have a great Help section, with support, webinars etc.
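
If you haven’t written a scraper before, here is a rough sketch of the kind of thing a code-based tool would run: pull links out of sloppy HTML and store them as structured rows. It is plain Python using only the standard library and is not specific to any of the tools above. The sample HTML, the `links` table and the function names are all made up for illustration.

```python
import sqlite3
from html.parser import HTMLParser

# Made-up, deliberately sloppy HTML standing in for a real page (no network needed).
SAMPLE_HTML = """
<ul>
  <li><a href="/events/1">Spring Fair</a>
  <li><a href="/events/2">Hack Day</a></li>
  <li>No link here
</ul>
"""


class LinkParser(HTMLParser):
    """Collects (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> we are currently inside, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape(html):
    """Return a list of (href, title) tuples found in the HTML string."""
    parser = LinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


def save(rows, db_path=":memory:"):
    """Store rows in a SQLite table (hypothetical name: links) and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links (href TEXT PRIMARY KEY, title TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO links VALUES (?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = save(scrape(SAMPLE_HTML))
    for row in conn.execute("SELECT href, title FROM links ORDER BY href"):
        print(row)
```

For a real site you would fetch the page first and pass its HTML to `scrape`, and point `db_path` at a file rather than the in-memory database used here.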

Have you used any of these? Leave a comment with your experiences. Or let me know of other alternatives!

Ian
