Fun with RSS and the RSS aggregator called Planet
This posting outlines how I refined a number of my RSS feeds and then aggregated them into a coherent whole using Planet.

Many different RSS feeds

I have, more or less, been creating RSS (Really Simple Syndication) feeds since 2002. My first foray was not really with RSS but rather with RDF. At that time the functions of RSS and RDF were blurred. In any event, I used RDF as a way of syndicating randomly selected items from my water collection. I never really pushed the RDF, and nothing really became of it. See “Collecting water and putting it on the Web” for details.

In December of 2004 I started marking up my articles, presentations, and travelogues in TEI and saving the result in a database. The webified version of these efforts was something called Musings on Information and Librarianship. I described the database supporting the process in a specific entry called “My personal TEI publishing system”. A program, make-rss.pl, was used to make the feed.

Since then blogs have become popular, and almost by definition, blogs support RSS in a really big way. My RSS was functional, but by comparison, everybody else’s was exceptional. For many reasons I started drifting away from my personal publishing system in 2008 and started moving towards WordPress. This manifested itself in this blog, Mini-Musings. To make things more complicated, I started blogging on other sites for specific purposes.
About a year ago I started blogging for the “Catholic Portal”, and more recently I’ve been blogging about research data management/curation (Days in the Life of a Librarian) at the University of Notre Dame.

In September of 2009 I started implementing a reading list application. Print an article. Read it. Draw and scribble on it. (Read, “Annotate it.”) Scan it. Convert it into a PDF document. Do OCR against it. Save the result to a Web-accessible file system. Do data entry against a database to describe it. Index the metadata and extracted OCR. And finally, provide a searchable/browsable interface to the whole lot. The result is a fledgling system I call “What’s Eric Reading?” Since I wanted to share my wealth (after all, I am a librarian) I created an RSS feed against this system too.

I was on a roll. I went back to my water collection and created a full-fledged RSS feed against it as well. See the simple Perl script, water2rss.pl, to see how easy it is.

Ack! I now have six different active RSS feeds, not counting the feeds I can get from Flickr and YouTube:

  Catholic Portal
  Life of a Librarian
  Mini-musings
  Musings
  What’s Eric Reading?
  Water collection

That’s too many, even for an ego surfer like myself. What to do? How can I consolidate these things? How can I present my writings in a single interface? How can I make it easy to syndicate all of this content in a standards-compliant way?

Planet

The answer to my questions is/was Planet: “an awesome ‘river of news’ feed reader. It downloads news feeds published by web sites and aggregates their content together into a single combined feed, latest news first.”

A couple of years ago the Code4Lib community created an RSS “planet” called Planet Code4Lib: “Blogs and feeds of interest to the Code4Lib community, aggregated.” I think it is maintained by Jonathan Rochkind, but I’m not sure. It is pretty nice since it brings together the RSS feeds from quite a number of library “hackers”.
Similarly, there is another planet called Planet Cataloging which does the same thing for library cataloging feeds. This one is maintained by Jennifer W. Baxmeyer and Kevin S. Clarke. The combined planets work very well together, except when individual blogs are in both aggregations. When this happens I end up reading the same blog postings twice. Not a big deal. You get what you pay for.

After a tiny bit of investigation, I decided to use Planet to aggregate and serve my RSS feeds. Installation and configuration were trivial. Download and unpack the distribution. Select an HTML template. Edit a configuration file denoting the location of RSS feeds and where the output will be saved. Run the program. Tweak the template. Repeat until satisfied. Run the program on a regular basis, preferably via cron. Done.

My result is called Planet Eric Lease Morgan. The graphic design may not be extraordinarily beautiful, but the content is not necessarily intended to be read via an HTML page. Instead the content is intended to be read from inside one’s favorite RSS reader. Planet not only aggregates content but syndicates it too. Very, very nice.

What I learned

I learned a number of things from this process. First I learned that standards evolve. “Duh!”

Second, my understanding of open source software and its benefits was reinforced. I would not have been able to do nearly as much if it weren’t for open source software.

Third, the process provided me with a means to reflect on the processes of librarianship. My particular processes for syndicating content needed to evolve in order to remain relevant. I had to go back and modify a number of my programs in order for everything to work correctly and validate. The library profession seemingly hates to do this. We have a mindset of “Mark it and park it.” We have a mindset of “I only want to touch a book or record once.” In the current environment, this is not healthy. Change is more the norm than not.
The profession needs to embrace change, but then again, all institutions, almost by definition, abhor change. What’s a person to do?

Fourth, the process enabled me to come up with a new quip. The written word read transcends both space and time. Fun!?

Finally, here’s an idea for the progressive librarians in the crowd. Use the Planet software to aggregate RSS fitting your library’s collection development policy. Programmatically loop through the resulting links to copy/mirror the remote content locally. Curate the resulting collection. Index it. Integrate the subcollection and index into your wider collection of books, journals, etc. Repeat.

Posted on May 25, 2011 by Eric Lease Morgan. Categories: Hacks, Reviews. Tags: Planet, RSS.
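
The closing idea above, looping through the links in an aggregated feed and mirroring the remote content locally, might be sketched in Python along the following lines. This is only an illustration under stated assumptions, not part of Planet itself: it assumes the aggregated feed is plain RSS 2.0, and the function names (extract_links, fetch_url, mirror) and the hashed file-naming scheme are hypothetical choices made for the example.

```python
import hashlib
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path


def extract_links(rss_xml):
    """Return the <link> of every <item> in an RSS 2.0 document, in feed order.

    Assumption: un-namespaced RSS 2.0. Namespaced formats such as
    RSS 1.0 (RDF) or Atom would need different element names.
    """
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links


def fetch_url(url):
    """Download a URL and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def mirror(links, directory, fetch=fetch_url):
    """Save each link's content under `directory`; return {link: local path}."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    saved = {}
    for link in links:
        # Hypothetical naming scheme: hash the URL to get a stable,
        # filesystem-safe file name.
        name = hashlib.sha1(link.encode("utf-8")).hexdigest() + ".html"
        path = target / name
        if not path.exists():  # skip items already mirrored on an earlier run
            path.write_bytes(fetch(link))
        saved[link] = path
    return saved
```

To use it, one would pass the bytes of the aggregated feed to extract_links and hand the result to mirror together with a local directory, for example from a cron job that runs after the aggregator. Curating and indexing the mirrored files are separate steps that this sketch does not attempt.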