Brent Simmons recently wrote some remarks on algorithms picking the content you might be interested in reading from among all the content you’re subscribed to.
Facebook’s timeline algorithm has received a lot of heat in the last few months after it was revealed that the timeline was essentially a part of an experiment and users following the default “Top stories” timeline were the test subjects.
When thinking about how my feed reader should help me sift through subscribed content, I thought it would be best if it did that automatically, by determining what kind of feed I’m subscribing to. Does it produce several stories or headlines a day? Or does it mostly contain long-form pieces published once a day at most? Perhaps it’s an aggregation service like Reddit or Hacker News, for the times when I feel like reading random articles that may not exactly line up with my main interests.
In order to determine whether a feed is mostly an aggregation service, it’s not enough to just check the URLs, because most feeds are pushed through third-party feed proxy services that replace the original post URLs with their own. So unfortunately the device (most likely on a 3G connection) has to go through all of them and resolve them into final URLs (using HEAD requests), so they can be used to evaluate how often the feed links to other people’s content. Gruber’s blog is one example that often does; Reddit, on the other hand, has its “self.*” posts that don’t. In the current setup, a feed is classified as a Link Blog when 90% of its posts link to external content.
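Here is a rough sketch of the 90% rule in Python. It is not the app’s actual code. It assumes the post URLs have already been resolved to their final destinations, that “external” means the resolved host differs from the feed’s own site host, and that the threshold is inclusive. The names `host_of`, `external_share` and `is_link_blog` are made up for illustration.

```python
from urllib.parse import urlsplit

# The 90% figure comes from the post; treating it as "at least 90%" is an assumption.
LINK_BLOG_THRESHOLD = 0.9


def host_of(url: str) -> str:
    """Return the lowercased hostname of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def external_share(site_url: str, resolved_post_urls: list[str]) -> float:
    """Fraction of posts whose resolved URL points away from the feed's own site."""
    if not resolved_post_urls:
        return 0.0
    own_host = host_of(site_url)
    external = sum(1 for url in resolved_post_urls if host_of(url) != own_host)
    return external / len(resolved_post_urls)


def is_link_blog(site_url: str, resolved_post_urls: list[str],
                 threshold: float = LINK_BLOG_THRESHOLD) -> bool:
    """Hypothetical classifier: a Link Blog mostly links to other people's content."""
    return external_share(site_url, resolved_post_urls) >= threshold
```

Comparing hostnames is a crude test. A site that hosts its posts on a different subdomain than its home page would be counted as external, so a real implementation would likely need a looser match.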
Post frequency classification seems easier to do on the device, though. Through a series of experiments I found a point at which, in my test feed group, the feeds classified as Trickle presented good content to read every day without overburdening me. But where is that point for other users? I guess I need to make the Trickle/Firehose point customizable in one of the coming versions.
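To make the Trickle/Firehose split concrete, here is a minimal sketch that assumes the classification is based on average posts per day over the items currently in the feed. The cutoff of 5 posts a day is a placeholder, not the value the app uses, and it is exposed as a parameter precisely because the right point differs per user. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def posts_per_day(published: list[datetime]) -> float:
    """Average number of posts per day across the span covered by the feed items."""
    if len(published) < 2:
        return float(len(published))
    span_days = (max(published) - min(published)) / timedelta(days=1)
    # Treat anything shorter than a day as one day to avoid inflated rates.
    return len(published) / max(span_days, 1.0)


def classify_frequency(published: list[datetime],
                       firehose_cutoff: float = 5.0) -> str:
    """Return 'Firehose' above the cutoff, 'Trickle' otherwise.

    firehose_cutoff is an illustrative default, not a value from the app.
    """
    return "Firehose" if posts_per_day(published) > firehose_cutoff else "Trickle"
```

A user-facing setting would then only need to change `firehose_cutoff`; the rest of the classification stays the same.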
I thought of adding “long form” to the classification, but it only works for feeds that include full post content. Many don’t, because RSS feeds are most often a way to lure the reader onto the site in order to sell eyeballs to advertisers. Some that do generate income through sponsored posts that appear in the feed, so they have no problem opening up all their content to it as well. This issue alone makes me want to add more customizability to the app in the form of user-defined lists (like Twitter lists, essentially) that contain manually picked feeds. I know I probably need a few of those: although Slashdot, Giga OM and Boing Boing are currently classified as Firehoses, it’d be nice not to have to go through the flood of other stories in order to read interesting content from those. It’s amazing, by the way, how, when subscribed to several similar news sites, one can see an echo chamber form, with some stories breaking in all those news outlets in the space of just a few minutes!
For now, I’m weighing my options to see what makes the most sense. Giving users full customization of the categories in the app did cross my mind several times. In that case, the categories available in current versions would be the initial set, since they are a good presentation of the app’s main strengths.
In related news, 1.1 of Headlines is waiting for a review, with support for big-ass iPhones and OPML import, along with other minor tweaks. Fingers crossed!