Keith Devens .com

Friday, September 3, 2010 Flag waving
A language that doesn't affect the way you think about programming, is not worth knowing. – Alan Perlis

Search: 'aggregator'

Sunday, December 18, 2005

  1. phil ringnalda » <style> in an age of HTML fragments. I love how everything is broken. And, I am surprised at how many aggregators get Atom feeds wrong.

       (0) Tags: [Programming]

Sunday, December 11, 2005

  1. Gregarius » A Free, Web-based Feed Aggregator. GPL, runs on your server. May try. The feed list in the default template takes up too much space.

       (0)

Tuesday, March 22, 2005

  1. Feedster goes in my aggregator hall of shame for continuing to fetch my old feed URL even years after it's been sending 301 redirects to the new location. Quit it!

       (8) Tags: [This website]

Thursday, March 10, 2005

Bloglines clipblog

I've been using Bloglines as my RSS aggregator and I'm largely happy with it. It works mostly "the right way" for an aggregator to work (i.e. not like an e-mail client), and it's great to have stuff come to me rather than me having to go to it. It's a shame everything looks the same -- I miss seeing everyone's individual layout -- but that's the price you pay.

I've created a "clipblog" at Bloglines at http://www.bloglines.com/blog/Keith83. Previously I've had two options when I wanted to save something I saw in my aggregator, usually in order to read later because I'm not home at the time. I could either keep the item as new in Bloglines' aggregator (but that sucks, since the feed grabs your attention by claiming new items even though you've seen them all before), or I could send myself an e-mail with a bunch of URLs in it. With the latter, I often wouldn't go through the e-mail, and since it was just a list of bare URLs there was no way to tell what things were without visiting everything. The e-mails often sit in Thunderbird and stay unread. If there was a way to edit the message to delete the links I've subsequently read I'd be better off, but there's not. Also, I'd often wind up sending myself a bunch of e-mails as I came across new links I wanted to save, so it's just a mess.

In addition to those two, I'd sometimes revive my del.icio.us account, but I wouldn't use the tagging so I wasn't really using it how it was meant to be used, and it's also another account to have to log into. Plus, if I wanted to write anything I'd be limited to 255 characters. Bloglines' clipblog feature doesn't have any of these problems for me. Ironically, I'm subscribed to del.icio.us/popular and I was spurred to create the clipblog when I wanted to keep track of a lot of those links.

So, feel free to visit the clipblog if you'd like. Much of the stuff there probably won't ever make it to my real blog (then again, maybe I'll just syndicate it on my blog once a day), but some of it will. Keep in mind that most of the stuff there I likely haven't read yet, and anything I read and then like will probably wind up on my blog anyway.

Hmm... I wish it automatically put the title of a weblog at the start of a link you blog straight from Bloglines instead of only the post title. And I really wish it had Javascript to automatically close the posting window after you post something.

Tuesday, March 8, 2005

  1. zFeeder - another RSS aggregator for me to check out.

       (0)

Tuesday, January 11, 2005

  1. BlogMatrix Jäger - RSS aggregator in the simple spirit of lilina that I'd like to try out. It's open source, written in Python, and uses wxWidgets (formerly wxWindows) and Cheetah. If I don't like something I can change it :) Open source rules.

    Update: I don't like Jäger. It actually doesn't act like lilina in that it still organizes based on feed, not based on time.

       (1)

Sunday, January 9, 2005

  1. lilina looks like a fairly neat aggregator (via a comment on Kellan's blog, via Alex). I don't think I like having feeds broken up into separate "folders" like most aggregators do. lilina shows all news items as one stream.

       (0)

Sunday, December 5, 2004

  1. Brent Simmons: Black Box Parsing: "NetNewsWire doesn’t know anything about RSS or Atom. Heck, it doesn’t know XML from Shinola. Which is an absurd claim—it wouldn’t be much of an aggregator if it were true. But—here’s the deal—to me as the developer, it is true. I'll explain..." Via Alex.

       (0) Tags: [Programming]

Tuesday, October 5, 2004

New feature: Comments page and rss feed

By request, there's a new feature on my weblog: I have a dedicated page just for new comments, http://keithdevens.com/weblog/comments, and an RSS feed for all new comments on all posts: http://keithdevens.com/weblog/comments/rss.

The comments HTML page will look nicer in the near future. As you can see I pretty much took exactly the HTML from the "Recent Comments" section in my right sidebar (in fact, both use the same template).

The only thing I'll warn about using my comments RSS feed is that if I get any comment spam you'll be spammed as well, and while I delete spam comments on my site, you'll be forced to delete them from your aggregator.

Let me know if you have any suggestions or comments.

Saturday, October 2, 2004

  1. FreePOPs:

    FreePOPs is an easily extensible program, which allows to have an access to the most varied resources through the POP3 protocol.
    Mainly, it can be used to download mail from the most famous webmails, but it could also be used as an aggregator for RSS feeds and many more. This way it is possible to get all your messages in your favourite email client.

    Programmed in Lua, and can access Gmail, Yahoo, AOL, and RSS, among other things. Neat!

       (0)

Saturday, August 28, 2004

  1. After Keith's recommendation of Sage, an aggregator extension for Firefox, I gotta give it a try. Unfortunately, I'm using a nightly of Firefox on which it seems installing extensions is broken.

       (0)

Thursday, August 26, 2004

  1. Info Aggregator - "an RSS-to-IMAP service".

       (0)

Monday, January 26, 2004

Integrating del.icio.us with my site: questions

I've mostly decided to integrate del.icio.us with my site. Before I do, however, I just wanted to talk out loud and ask for feedback on some things.

It's been interesting because as I've recently tended towards long, link-heavy posts I've gotten varying feedback. Some considered them "information overload", some love the links posts ( :D ), and others like the link posts, but don't like the update schedule. If I added many links throughout the day the post kept constantly popping up in people's RSS aggregators because of the update. Since I got those comments I've tried to only update as few times a day as possible in batches rather than updating for individual links. On a related note, Erik, linkblogger extraordinaire, would often link to my link posts, but if I wound up initially having a link post with only a few ok links but later filling it with tons of great links Erik would pass it by. That leads me to believe that his aggregator doesn't handle updated posts the same way -- maybe if a post has been deleted in the aggregator it doesn't show up again if it's been updated. (Erik, care to comment?)

So, one of my biggest questions has to do with update times. Should I pull the links from del.icio.us only once a day, in which case I'm likely to have "stale" links by the time they wind up on my site, or maybe as many as four times a day (of course, only updating if there are new links)? The bigger problem here, however, is that del.icio.us does everything in UTC, while I'm in Eastern time. I'd like to work with it on Eastern time, so if I ask for links for "today" or "yesterday" (giving it the actual date, of course), I want it to give me "my" day, not Greenwich's day.
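A minimal sketch of the day-boundary problem, in Python, assuming each link comes back with a UTC timestamp: work out the UTC window that covers one Eastern calendar day, then keep only the links that fall inside it. The function names and the `posted_utc` field are made up for illustration and are not part of del.icio.us's API.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")


def utc_window_for_local_day(day, tz=EASTERN):
    """Return the [start, end) UTC datetimes that cover one local calendar day."""
    start_local = datetime.combine(day, time.min, tzinfo=tz)
    end_local = datetime.combine(day + timedelta(days=1), time.min, tzinfo=tz)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)


def links_for_local_day(links, day, tz=EASTERN):
    """Keep the links whose (hypothetical) UTC timestamp falls on "my" day."""
    start, end = utc_window_for_local_day(day, tz)
    return [link for link in links if start <= link["posted_utc"] < end]
```

With this, a link posted at 02:00 UTC on January 27 still counts as a January 26 link in Eastern time, since that day's window runs from 05:00 UTC to 05:00 UTC.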

The other main problem is that the format is constraining. A lot of times I want to have something like:

or I want to have some descriptive text before I give the link. You can see my most recent link post for examples of things that would be impossible to do within del.icio.us's format.

Finally, it limits you to 255 (I assume) characters of "extended" text. Sure, if I have anything lengthy to say it should probably be put in its own post. But 255 is rather short, and given that I'm going to be sticking "via" URLs (among others) in my extended entries, it's even shorter.

By the way, hands down the feature I love the most about del.icio.us is the pop-up poster. Though, for some reason it works as a pop-under for me.

Wednesday, December 17, 2003

Feedster RSS aggregator

Feedster now has a web-based RSS aggregator you can sign up for, myFeedster.

Thursday, July 10, 2003

I don't like RSS readers?

Hmm... I think I just came to the conclusion that I don't like RSS aggregators.

For sites that I really like and read frequently I often just visit the home page. For sites that don't update as frequently, however, I usually use my aggregator to tell me when there's something new so I don't waste time checking if there isn't. And whenever I use an aggregator, I usually just wind up double-clicking on a feed or a post so I can view the site or the post directly, rather than reading it in my aggregator, which I've found I don't like to do.

So really, I only use my aggregator as a form of update notification. But for that purpose, it's not even very efficient because updated sites are mixed in visually with sites that haven't updated. Plus, you wind up keeping a bunch of extra posts sitting around in your aggregator, which you have to be aggressive at deleting or you wind up getting buried.

I think the update notification scheme that forums use is the most efficient possible. They tell you when there's something new, but don't bother you again until you've visited the site. That way, you can only ever get one notification for each time you visit the site. In an RSS reader, by contrast, I constantly have to look at all my feeds and feel overwhelmed by all of them, either because there are feeds there with unread items, or simply because they take up more space than can fit on my screen at once[1].

For a while I've wished that there was an efficient way to have an e-mail-based RSS aggregator, where new posts would be sent to you through e-mail. The biggest barrier to that I've seen until now was that you'd need to have an elaborate folder and filter setup to filter all of the e-mail into appropriate folders. This would have to be managed automatically or it simply wouldn't be worth it.

Though, now that I've thought about it, I really don't want to be bothered for every new post on a site. I want to be told when it's been updated, and not be bothered again until I've gone to the site.

So, my new project will be to implement this idea. It'll be done through e-mail, but it'll be manageable simply because I'll only get an e-mail for each site that updates, not each new post, which means that I only need one folder for it. The main part of this is the code that figures out whether a site has updated, and that can be done as simply as RSS aggregators can do it. The way I currently envision it is a simple script on my web site that runs with a cron job, checks every site in my list that hasn't updated since I last visited, and if the site has updated, send me an e-mail. What's nice is that if a site has updated and I haven't visited it yet it won't be checked again until after I visit it, which is a bandwidth saver better than anything an RSS reader can do.

Lastly, I'd need to visit each site through a script on my site that tracks usage, something like keithdevens.com/go?[link_id]. Unfortunately, I think that's going to send a referrer to each site every time I check it, which will be annoying. I know with Firebird I can use an extension that allows me to turn off referrers temporarily, so I'll see if that makes sense for me to use.
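A rough sketch of that scheme in Python, under some assumptions: each site has a "marker" (say, a hash of its feed or the date of its newest post) that changes when the site updates, and `fetch_marker` and `send_email` are hypothetical callables supplied by the caller. `check_sites` is what the cron job would run; `record_visit` is what the go?[link_id] redirect script would call.

```python
from dataclasses import dataclass


@dataclass
class Site:
    url: str
    seen_marker: str = ""   # fingerprint of the site as of my last visit
    pending: bool = False   # True after an e-mail is sent, until I visit


def check_sites(sites, fetch_marker, send_email):
    """One cron run: poll only the sites with no outstanding notification."""
    for site in sites:
        if site.pending:
            continue  # already notified and not yet visited: skip the fetch
        if fetch_marker(site.url) != site.seen_marker:
            send_email(site.url)
            site.pending = True


def record_visit(site, fetch_marker):
    """Run by the redirect script before forwarding me to the site."""
    site.seen_marker = fetch_marker(site.url)
    site.pending = False
```

The `pending` flag is what gives the one-notification-per-visit behavior and the bandwidth saving: a site with an unread notification is not fetched at all until it has been visited. Re-reading the marker at visit time is one possible choice here; it costs an extra fetch but means updates made after the e-mail went out don't trigger a second one.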

The only other improvements on this I can think of right now are these: If I've been sent an e-mail, but haven't checked it yet, and have visited the site after the e-mail was sent, it'd be nice if the e-mail could be automatically deleted to avoid getting duplicates. Or if I used IMAP, it could be deleted even if I've already seen it. Otherwise, instead of doing this through e-mail, I may do it as something web-based.

Footnotes:
[1]: Which, by the way, is a very important design consideration. If something can't be seen all at once, the user feels overwhelmed. Think about this in terms of your e-mail. You know when you clear out your e-mail box and all of a sudden when you only have a screenful left it doesn't seem like a lot. But add one more e-mail so you have to scroll, and you might as well have 1000 unread e-mails.

Saturday, June 14, 2003

Comment RSS feeds

This is a great new feature. Read this post from Sam Ruby and follow all the links. I think what this means is that for every item in an RSS feed, in aggregators like SharpReader we should soon be able to hit a plus button on the left and see a list of all comments on the post without having to leave the aggregator. And they can be checked for updates, etc. All this will be auto-discovered from the site's main RSS feed, assuming the site's feed supports this new feature. Sounds awesome.

One of these days I have to update my RSS feed to 2.0. Now that most of the rest of my CMS is together, I think I can finally devote time to this :)

P.S. Even though Sam's post is titled "Collaboration through namespaces", I still stand by my earlier rant on XML namespaces. This feature isn't dependent on namespaces, in that it wouldn't have been impossible without them.

Most importantly, the way this evolved was exactly how I said these types of things evolve. Someone proposes something to add to a spec, the owner of the spec agrees, and the community goes on to accept and write software around the new feature. Namespaces (as in putting a "wfw:" in front of the tag name) ultimately don't get you anything.

Tuesday, June 10, 2003

New version of SharpReader

I just noticed that a new version of SharpReader has been released. I haven't even been using the aggregator because I let my list of feeds get out of hand. I have to pare everything down to a level I can keep up with. I'm going through all my feeds and creating "One" and "Two" categories. "One" being for feeds I tend to keep up with. I shouldn't fall behind on those feeds. The "Two" category is for everything else -- sites I like and want to have in my aggregator, but probably produce too much content for me to keep up with :).

Unfortunately, my blogroll code is only meant to handle one level of categories. So I'll have to come up with a new scheme for one or the other.

Sunday, April 6, 2003

More RSS aggregators!

A lot of people seem excited about Harvester. The screenshots look pretty nifty (but what the heck is up with the circle thing to not show the whole window? That's annoying...)

Check out some of his old posts too.

Next, via Matt Croydon, SharpReader looks really cool... yeah, ok, it fucking crashes on startup. It doesn't even run! Damnit.

Friday, April 4, 2003

Simon builds an RSS aggregator

Wow, it's really a day of RSS aggregator news, isn't it? Simon's building an RSS aggregator and discovers that...

It has been quickly becoming apparent that "Really Simple Syndication" is anything but! There are currently three major (and goodness knows how many minor) specifications doing the rounds, and the majority of feeds seem to pick and chose between the three at will.

Read the post to see lots of great examples why all this stuff is a huge pain in the butt.

He also references REX: XML Shallow Parsing with Regular Expressions, which I've never heard of before.

Sort indices instead of data

You may have noticed my new blogroll on the left. That's generated straight from my list of feeds produced by my new RSS aggregator, NewsDesk. The source OPML file for that is at /mySubscriptions.opml (that seems to be the standard place for these things).

As you can see, the OPML file isn't sorted either by category or by feed name within the category. It would be nice if it were sorted (it's sorted within the program), but whatever, life isn't perfect. You'll notice that the blogroll on my left is sorted by category, and the feeds within each category are also sorted. So, I had to write code to sort that.

Sorting this OPML file isn't as simple as sorting a list of numbers. You can do that easily with built in sort routines. To sort my blogroll, I used a trick I've used before. You sort the indices, but leave the data alone.

I use my XML parser (part of my XML-RPC library) to parse the OPML file. Anyway, I'm not sure how to explain it much more beyond just showing code:

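<?php
// Load my XML library, which provides XML_unserialize().
$cms->incLibrary('xml');

$c = file_get_contents($_SERVER['DOCUMENT_ROOT'] . '/mySubscriptions.opml');
$b = XML_unserialize($c);

// Reference the category outlines so the parsed data is never copied or moved.
$cat = &$b['opml']['body']['outline'];
$cat_count = count_numeric_items($cat);

// Build a small array of (sort key, original index) pairs, one per category.
$categories = array();
for($n=0; $n<$cat_count; $n++){
    $categories[$n] = array('text'=>$cat["$n attr"]['text'], 'index'=>$n);
}

// Sort the index array, not $cat itself.
usort($categories, 'blogroll_cmp');

function blogroll_cmp($a, $b){
    return strnatcasecmp($a['text'], $b['text']);
}
?>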

You get the idea. Then I iterate over $categories, not $cat, but index into $cat, and do basically the exact same thing for the feeds within a category, printing out HTML along the way, etc.

Anyway, I just wanted to relay the technique of sorting indices, rather than sorting the actual data, because it comes in handy in a lot of places. If anything, sometimes it can simply be more efficient because you don't have to actually move a lot of data around within the system.
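For comparison, here's a minimal sketch of the same sort-the-indices idea in Python. The `sorted_indices` helper and the sample data are made up for illustration, and `str.casefold` gives a plain case-insensitive comparison rather than the natural-order one `strnatcasecmp` does.

```python
def sorted_indices(items, key):
    """Return the positions of items in sorted order; items itself is untouched."""
    return sorted(range(len(items)), key=lambda i: key(items[i]))


# Stand-in for the parsed OPML categories (hypothetical data).
categories = [{"text": "Tech"}, {"text": "friends"}, {"text": "News"}]

order = sorted_indices(categories, key=lambda c: c["text"].casefold())

# Walk the index list, but read from the original, unsorted data.
for i in order:
    print(categories[i]["text"])
```

The data stays exactly where the parser left it; only the small list of integers gets rearranged.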

RSS reader jubilation!!

I think I've finally found it. An RSS reader I like!! What a relief!

Via Bill Kearney's weblog at Syndic8.com, check out NewsDesk. I've installed it, and I'm switching to it right now!

Great features:

  • Freaking categories! Thank you! With drag and drop even!
  • It works and is fast
  • Imports and exports every format you need, OCS and OPML
  • Searching - oh man!
    • When you search, it filters all your feeds based on your search terms, and when you click on a feed it only shows you those items that match the search term.
  • It has an inline browser like Syndirella. In fact, it's based on .NET too.
  • It's got a three-paned interface like you'd expect. However, it works like I always thought an RSS aggregator should work. If you click on the feed name itself on the left, in the right two panes it shows the list of posts on top, and in the bottom pane it immediately shows a summary of all posts for that feed. Then if you click on a post in the top list it shows that individual post.
  • It integrates heavily with NewsIsFree. Not something I need, but it's nice.
  • You can drag a url to it to add a feed. It has lots of features like that - it even supports the click-a-button-on-a-web-page-and-have-it-invoke-your-rss-reader feature like AmphetaDesk and Radio have.
  • It has lots of system tray features. Again, not something I need, but it seems to work really well.
  • It comes with an RSS to e-mail thing that can e-mail you new headlines! Freaking awesome, but I haven't tried it yet.
  • It's fast! I don't expect to have any of the horrible memory problems that I've consistently had with Syndirella.

It's just really really awesome.

Now for some nit-picky things.

  • Comes with an icky default background color. It's like a puke brown/olive color. (See, this is how small my issues are.) But you can change it easily.
  • I'd like a modification of the summary of all posts for the feed feature I mentioned above. I'd like it to at least have an option to show only the unread posts for that feed, rather than all of them.
  • I'd like to have unread posts show up in bold, rather than just have an "unread" marker next to a post (which also happens to be a few pixels too close to the text of the post title for my comfort).
  • When you first import a feed (from an OPML file, for instance), the "Get New Headlines" option is greyed out for that feed for some reason. I don't know if that's also the case when you add a new feed, in addition to when you import it. It takes a restart of the program to fix that.

See how minor my problems are? I'm very pleased. I'll update this if I run into any problems, but for now, I'm very happy. Watch for my Blogroll to be updated directly with the feeds from my OPML file, now that I use an aggregator that has categories.

Also make sure you check out John Abbe's great and comprehensive list of RSS readers.

Ok, one crash, while I was dragging stuff around into different categories. Though, it caught the exception and didn't seem to have a problem continuing. I reloaded my feed list just in case... update: yeah, it always crashes if you drag feeds around, and let your mouse hover over another feed so that it loads the contents of the feed into the right panes. But it's always able to recover from the exceptions. I'll have to e-mail the author...

Hey, and it doesn't leave crap bogus referrers!

Some other features I want:

  • I'd like to have it be able to use my system default browser when opening an external browser.
  • I'd like it to open the website for the feed when double clicking on a feed in the lefthand tree.
  • The parser isn't as forgiving as Syndirella's is. One of the error messages on a broken feed is the following: "Invalid byte was found at byte index 2569." It seems that you should be able to replace that byte with a space and try parsing again, repeating as necessary.
  • The list of feeds on the left seems like it's more "spaced out" than a normal explorer view is. With a lot of feeds, I'd like it if I can fit more feeds on a screen somehow. Don't know if that's possible. However, the categories and alphabetization make up for that in spades because it's much easier to check whether a feed has new posts because it is where you expect it.
  • There's a filter to show either "all headlines" or only feeds that have new posts within the last 15 minutes, hour, all the way up to 72 hours. I'd like that to have an option that just shows all feeds with any unread posts.
  • I'd like to be able to right click on a post in the top-right hand pane and get an option to copy that post's permalink to the clipboard. Syndirella has this, and it's convenient (except where the feed doesn't have permalinks to its own posts, but rather to a link in the post).
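The forgiving-parser idea from the list above (replace the invalid byte with a space and try again, repeating as necessary) can be sketched roughly like this in Python. This is only an illustration of the retry loop at the decoding stage, not how NewsDesk or Syndirella actually work; the function name and the repair limit are made up.

```python
def decode_forgiving(raw, encoding="utf-8", max_repairs=100):
    """Decode raw bytes, overwriting each invalid byte with a space and retrying."""
    data = bytearray(raw)
    for _ in range(max_repairs + 1):
        try:
            return data.decode(encoding)
        except UnicodeDecodeError as err:
            # err.start is the index of the first offending byte.
            data[err.start] = 0x20
    raise ValueError("gave up: too many invalid bytes")
```

Python's built-in `raw.decode("utf-8", errors="replace")` gets a similar result in a single pass, substituting U+FFFD instead of a space; the loop above just mirrors the retry approach described in the bullet.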

Wednesday, April 2, 2003

Looking for a new aggregator

Since Syndirella has been pissing me off so much (I went looking to see what's going on with Dmitry, and his site is down), I went looking to see if there were any aggregators that didn't suck. So, at Brilliant Corners I came across an aggregator named Awasu.

It kind of sucks too (it keeps popping up balloons in my task bar, even though I told it not to by default, but its OPML importer used all the default settings), it doesn't do RSS auto-discovery, it doesn't support categories and there's no way for me to reorganize my feeds, and it doesn't allow me to mark all items as "read". However, it does have a neat plugin architecture that lets you define your own data sources in Java.

So, now the question is... which sucks less? Sigh.

Update: Well, Awasu sucks, so I kept looking. I came across this page listing a bunch of aggregators, and came across Beaver, which also sucks! Though, in its defense, it's only version 0.4.3, so hopefully it'll get better.

I also came across this RSS to e-mail gateway script, which I think is a great idea. There's an aggregator that works through NNTP, but I tried that and didn't like it.

I would really like to read my feeds through my e-mail program, but to do that effectively I'd have to be able to filter my feed items into separate folders based on the feed. Of course I could do that with a similar gateway script and setting up a filter for every single feed, but there's no way I'm doing that. If only my e-mail program had some kind of scriptable interface, but noooo.

Why does everything suck? That's it, I want a Mac, they have better software.

Saturday, March 29, 2003

I want a better aggregator!

Not to be disparaging, but it seems like Dmitry has slowed development on Syndirella. As I add more feeds, it's getting harder and harder to find things... I could really use some categories :)

Monday, March 24, 2003

Blogroll

I'm finally getting around to setting up a blogroll over on the left there. (Only on /weblog, not at the root... easy to do with my CMS without any weird conditionals (hah))

It's in progress. I'm going through my aggregator and seeing who I read regularly. Not done yet. Going to eat dinner.

Wednesday, March 5, 2003

RSS aggregator mock-up in Python

Via Bitworking, I gotta check out Sam Ruby's RSS aggregator mock-up. At the very least it'll illustrate the right way to use wxPython :)

Saturday, February 15, 2003

Losing my mind

As well as other things... a little while ago I drew out a mock-up of what my web-based RSS aggregator would look like, should I choose to write one. Where did it go?

I remember the basic look of it, but mainly I'm trying to remember what was on all my tabs at the top of the screen...

Saturday, February 8, 2003

RSS aggregator in Python

Given my recent problems with Syndirella[1], I'm looking to potentially write my own RSS aggregator in Python.

I have some interface ideas, as well as some backend ideas. To make an aggregator I'd want to use, it'd need to use remote scripting as well as threading. Here's an article from IBM developerWorks on remote scripting, and here's a page from Microsoft on RS. I'm not really sure what support Python has for threading yet, so I'm doing my research.

Ok, Python has good enough threading support :) The Python Essential Reference is awesome.

Footnotes:
[1]: disclaimer: I realize that Syndirella is an alpha quality product right now. For this point in its development it's actually very good, and it's only going to get better. I'll very likely never finish what I'm talking about in this post. Either way, I figured this research would be useful...

Wednesday, February 5, 2003

Syndirella ate my feeds!!!

AAAAaaaaaaaaaaahhh... Syndirella just ate all my feeds! Boooo. That'll teach me to download new builds all the time. Maybe I'll start building my feeds from scratch? I could always export them again from my old aggregator, but maybe I'll start fresh...

Ok, this is the second time I've tried to add an RSS feed by giving Syndirella the actual location of the feed, and it claims it can't find the feed. Aaaah. SOB.

Man, I wish I had a Mac so I could use NetNewsWire.

Maybe I can use nntp//rss?

Quickies

Via WHEDONesque, a great commentary on the most recent episode of Buffy. I still want to know how Giles avoided the axe that was an inch from his neck. I was also confused when I saw Amy, she seemed to take so many hints out of "The First"'s book that I thought she was the first at first :)

Scott: Understanding The Owl Document Management Permissioning Model

Weblogs; Usenet Done Over?

I was reading discussions of news aggregators and RSS and lack of presentation in aggregators and webpage scrapers and trackbacks come-tos and on and on and it suddenly occurred to me:

We've managed to re-invent UseNet, only instead of topic-oriented newsgroups, we have a system where each newsgroup is an individual's (or a small group's) personal playground. Only it's been done poorly.

That's a whole lot of effort to get to the same place :-)

Also, does anyone know where Dave (the Pragmatic Programmer)'s weblog is? I came across it yesterday, but now I'm not sure where I saw it.

Thursday, January 30, 2003

Stick to the user-agent, or RSS readers misuse the referrer header

Referral Abuse:

It would be nice if there was some sort of browser header the aggregator could send to identify itself instead of using the referrer field. Oh, that's right, there is. It's called User-Agent.

I've covered this before. I feel evil that I'm now using an aggregator that does just this:

Some aggregators have taken things a step further by allowing the user to use any arbitrary URL as the referrer.

Bad.

So I get 48 "referrals" each day from www.hardhathosting.com even though there's not a single link from their site to mine.

Me too! Finally, Jason has a great argument:

Each time I load a page in Internet Explorer, I don't leave a referer for www.microsoft.com/ie in the log files of the site whose page I loaded, so why should any of the RSS readers be different?

Plus, he has an actual reference to the HTTP RFC where they declare these types of abuses to be illegal.
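To make the point concrete, here's a small sketch in Python of a feed request that identifies the aggregator the polite way: in User-Agent, with no Referer header at all. The agent string and URLs are made-up placeholders.

```python
from urllib.request import Request

# Hypothetical identification string: product/version plus a URL for the curious.
AGENT = "ExampleAggregator/0.1 (+http://example.com/aggregator)"


def build_feed_request(feed_url, agent=AGENT):
    """Build a request for a feed that names the aggregator in User-Agent only."""
    return Request(feed_url, headers={"User-Agent": agent})


req = build_feed_request("http://example.com/rss.xml")
print(req.header_items())
```

The site owner still gets to see which program fetched the feed, and the referrer logs stay limited to pages that actually link to the site.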
