LISZEN: A search engine for library/ian blogs

By | November 4, 2006

There was a comment on this excellent service’s blog about having a single RSS feed for all 500 blogs!!!

I’ve almost done it in the past. In fact, it was the starting point for my own http://www.infolitworldnews.com, where Google Co-op is just the search box and all the rest is feed aggregation.

I built it on top of earlier work on mini “Bloglines-like” sites for an association where I’m a board member:

The problem is… 500 blogs at 2 posts a day is 1,000 posts a day, which is about 42 posts an hour, or one post roughly every minute and a half… is anyone capable of digesting every single thought a librarian exudes to his/her blog? However, it can be done.
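As a quick check of that arithmetic, here is a minimal sketch. It assumes every blog posts at the same steady rate around the clock, which is certainly not true of real blogs, and `posting_rate` is just a name picked for illustration.

```python
def posting_rate(blogs, posts_per_blog_per_day):
    """Return (posts per day, posts per hour, posts per minute) for the whole set."""
    per_day = blogs * posts_per_blog_per_day
    per_hour = per_day / 24       # spread evenly over 24 hours (an assumption)
    per_minute = per_hour / 60
    return per_day, per_hour, per_minute


per_day, per_hour, per_minute = posting_rate(500, 2)
print(f"{per_day} posts/day, {per_hour:.1f} posts/hour, {per_minute:.2f} posts/minute")
# prints: 1000 posts/day, 41.7 posts/hour, 0.69 posts/minute
```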

Using what I know about feedonfeeds (the engine behind Info Lit World News), it would take forever to harvest the 500 blogs (15 seconds a blog: 7,500 seconds per cycle, 125 minutes… hum… seems feasible, but we would never have the blog posts up to the minute).
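The cycle time can be recomputed with a few lines. This is a minimal sketch, assuming feeds are fetched strictly one after another and each one takes the same 15 seconds; `harvest_cycle_seconds` is a name made up for illustration, not part of feedonfeeds.

```python
def harvest_cycle_seconds(feeds, seconds_per_feed):
    """Time for one full pass over all feeds, fetched one after another."""
    return feeds * seconds_per_feed


seconds = harvest_cycle_seconds(500, 15)
print(f"{seconds} seconds per cycle, {seconds / 60:.0f} minutes")
# prints: 7500 seconds per cycle, 125 minutes
```

Under these assumptions a post can be up to one full cycle old before it shows up, which is why up-to-the-minute delivery is out of reach.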

Using a feed instead of a site as the delivery mechanism is a piece of cake. If I have time I’ll do it this weekend.
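To give an idea of what “a feed as the delivery mechanism” might look like, here is a rough sketch that turns already-harvested posts into one RSS 2.0 document. It is not how feedonfeeds does it: the function name, the dictionary keys, and the example.org URLs are all assumptions made up for the example, and it only uses Python’s standard library.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_combined_feed(posts, title, link, description, limit=50):
    """Build one RSS 2.0 document from posts harvested from many blogs.

    `posts` is a list of dicts with the (hypothetical) keys
    "blog", "title", "link" and "published" (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description

    # Newest posts first, capped so the feed stays a manageable size.
    newest_first = sorted(posts, key=lambda p: p["published"], reverse=True)
    for post in newest_first[:limit]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f'{post["blog"]}: {post["title"]}'
        ET.SubElement(item, "link").text = post["link"]
        ET.SubElement(item, "pubDate").text = format_datetime(post["published"])
    return ET.tostring(rss, encoding="unicode")


# Made-up sample data standing in for the harvested cache.
sample_posts = [
    {"blog": "Blog A", "title": "Older post", "link": "http://example.org/a/1",
     "published": datetime(2006, 11, 4, 9, 0, tzinfo=timezone.utc)},
    {"blog": "Blog B", "title": "Newer post", "link": "http://example.org/b/1",
     "published": datetime(2006, 11, 4, 11, 0, tzinfo=timezone.utc)},
]
feed_xml = build_combined_feed(
    sample_posts, "All blogs", "http://example.org/", "Combined feed (sketch)")
print(feed_xml)
```

The `limit` parameter is the interesting design choice: with hundreds of blogs, a combined feed has to cap how many items it carries at once.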

11:00 OK. It’s going well… but 68 feeds, as a test, produced 980 cached posts… does this thing scale without needing to sell the service to Bloglines or Technorati?

After all, this would be a piece of cake if Bloglines output my subscriptions as a single feed!

It’s now 17:00 and I’ve been working on this problem since 9:00.

There is something wrong with the idea: I have half the blogs loaded and I already have 60 posts in the previous hour!!! This feed will be impossible to follow!!!

15:32, the next day: