If acceptance of a specification confers a kind of immortality, David Winer is on his way to the heavens. Winer is, among other things, the godfather of RSS, Really Simple Syndication, an XML dialect that allows people to subscribe to an information source and get notified when something new is added. Many attempts have been made to standardize on a "publish/
The Web is now dotted with orange rectangular buttons labeled "XML" or "RSS," each one marking a separate syndication "feed." The feeds come from major news sources, among them such English-language publications as The New York Times, The Wall Street Journal, The Boston Globe, The Christian Science Monitor, Time magazine and Newsweek; online news sources like Salon.
RSS feeds are mostly text, but not exclusively. A new twist for syndication is "podcasting," in which audio files, predominantly MP3s, are delivered through RSS to a client, to be played on a PC, Mac or MP3 player. Early users include commentators like MTV's Adam Curry, who "podcast" radio-like programming. Podcasting employs an RSS 2.
Syndication civil war
If you click on one of the orange buttons, you get an XML document whose top line usually carries one of two tags. Behind these two tags is a story of unusual acrimony, even for an evolving specification.
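As a general point about the formats (not something taken from this article), RSS 2.0 documents use an rss root element, Atom documents use a feed root element, and the RDF-based RSS versions use an RDF root element. The sketch below is one hypothetical way a reader program could tell them apart using only Python's standard library. The function name feed_dialect and the two sample documents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def feed_dialect(xml_text):
    """Guess the syndication dialect from the document's root element."""
    root = ET.fromstring(xml_text)
    # ElementTree reports namespaced tags as "{namespace}name"; keep only the name.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name == "rss":
        return "RSS " + root.get("version", "unknown version")
    if local_name == "feed":
        return "Atom"
    if local_name == "RDF":
        return "RDF-based RSS"
    return "unknown"


# Hypothetical, minimal documents used only to exercise the function.
rss_doc = '<rss version="2.0"><channel><title>Example</title></channel></rss>'
atom_doc = '<feed xmlns="http://www.w3.org/2005/Atom"><title>Example</title></feed>'

print(feed_dialect(rss_doc))   # RSS 2.0
print(feed_dialect(atom_doc))  # Atom
```

A real aggregator would also need to handle malformed XML and older or unusual feed variants, which this sketch ignores.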
RSS's origin is a bit blurry, but people generally trace it back to scriptingNews, a syndication format created in 1997 by Winer, and RSS .90, which was developed in 1999 by Netscape for its user-customizable page. (Netscape's RSS stood either for "RDF Site Summary" or "Rich Site Summary.")
Winer, who founded Userland Software and created one of the Web's first blogs, thought RSS .90 too complex because of its use of the W3C's Resource Description Framework. He moved forward with versions .91 and .92, eventually publishing the current spec--RSS 2.
The more viable contender is the Atom project, a rival XML syndication protocol, currently in version 0.
Bray has a point in that RSS 2.
"RSS is by no means a perfect format, but it is very popular and widely supported," Winer writes in the RSS 2.
Is there room for a truce between RSS 2.
RSS reading and self-syndication
If the standard for XML syndication is becoming clear, the software supporting it has become fiercely competitive. RSS readers (or "aggregators," as they are sometimes called) have proliferated, and which models will prevail-becoming the Google or Internet Explorer of RSS-remains to be seen. This market could fade quickly if Internet Explorer ever shipped with strong RSS reading capability, as the browser is the most obvious place to read RSS feeds. But Internet Explorer's most recent upgrade, in XP Service Pack 2, had no RSS support whatsoever-giving Mozilla's Firefox browser an opening.
Here are some of the forms RSS readers now take:
- Extended bookmarks. Because RSS feeds eventually lead to some kind of Web page, integrating the reader with a browser makes intuitive sense. One way to go about this is simply to extend the bookmarks to include the RSS feeds being tracked. Firefox, the open source browser from the Mozilla project, contains a rudimentary RSS reader called Live Bookmarks that does just that. Pluck, a more extensive reader, integrates tightly with Internet Explorer.
- Extended mail. Some RSS readers work within a mail program, or at least resemble one-thereby putting email, Usenet feeds, and RSS feeds under one program. The three-pane, full-screen interface of many email readers makes this a good tactic for people who want to track large amounts of information. The Norwegian browser Opera takes this approach in its built-in email facility. Mozilla's Thunderbird does the same.
- Standalone. Some readers are standalone applications. For example, Australia's Awasu runs in the background under Windows, notifying users when new material comes in and then displaying the contents. You could also use Awasu on its own as your browser of choice.
- Web-based. Some RSS readers are accessible online, thereby allowing more casual users to try their hand at RSS without having to install extra software. My Yahoo, Yahoo's customizable page, is the best known, and with a recent site overhaul, the service is very convenient. You can choose among popular RSS feeds or select your own, and they appear as news items on your customizable page. Ironically, My Yahoo finally fulfills the dream Netscape had when it created its version of RSS in the first place.
My Yahoo's RSS support puts the Yahoo portal ahead of Google, which has, at this writing, no syndication support whatsoever. And that has created an opportunity for search engines like Sinic8.
A Feedster search is not nearly as comprehensive as a Google search and tends to show more blogs than other classes of online information. A search for the New York Times, for example, does not actually come up with any New York Times RSS feeds. But the idea does make sense. You can subscribe to a feed located by a Feedster query, or even subscribe to the search itself-which enables the tracking of very specific information. That idea appeals to Ben Goodger, lead developer of the Firefox browser. "I find Feedster to be a dandy aggregation engine-and their search results pages are syndicated via RSS," he wrote in his blog. Using Firefox with Feedster is an easy way to get highly customized updates: you run a search on Feedster, subscribe to the results of that search, and add the Live Bookmark to your toolbar. "Easy aggregation - doesn't get much simpler than that."
RSS publishing software and services are also growing-and syndicating your blog has become easy using services like Blogger and FeedBurner. Jim Mahar, a professor of finance at St. Bonaventure University in New York State, used FeedBurner to syndicate two blogs he keeps-one with an international following, the other for his students. Mahar began using the Internet as a way to keep former students informed of events in his field. He began with a newsletter, emailed to about 5,000 addresses, about 800 of which bounced because of anti-spam filters. The blog has slowly replaced the newsletter as the better medium, and RSS completes the picture by letting subscribers know when he has made updates. Mahar picked the first company that came up in Google-the FeedBurner syndication service-and got the job done in less than 40 minutes. Almost immediately, he got a spike in traffic. "This past week, I had two interviews for different radio shows who have been reading the blog for corporate finance news stories." He's convinced that syndication was key.
"Blog clog"
As RSS grows in popularity, so do fears that syndication will chew up bandwidth. That already seems to be the case for bloggers big and small. Microsoft, for example, has fed entire blog entries to participants in the Microsoft Developer Network (blogs.
Some bloggers objected. "In the blogosphere, there is hardly anything more irritating than an abbreviated RSS feed," wrote Steve Main on his blog. "The WHOLE PURPOSE of an RSS aggregator is so that I don't have to open my freaking web browser to 100 different pages. By having the content right there in my aggregator, I can skim an entire article in the time it takes to open up a new web browser. By not including full content in the RSS feed, you take away some of the productivity gains that RSS offers."
Microsoft responded by upping the limit to 1250 characters. MSDN head Sara Williams asked on her blog: "Why serve up 400k of content when we know that folks...
Her point is well taken, says Gary Lawrence Murphy, who runs a personal blog and a few websites from his home in Sauble Beach, Ontario. Murphy attracts only about 4,000 to 6,500 unique visitors a day. But after he began syndicating, he started getting notices from his carrier that his paid-for network capacity would be exhausted for the day. Murphy says that there are two ways to look at RSS. Either it delivers a notice that the content you are following has changed, leaving it to you to go to the Website; or it delivers the entire content itself. The former, he says, was the original idea of RSS. "The idea was 'microcontent'-the stories should be brief enough where you could get the idea on a cellphone or PDA," he says.
Murphy contends that many RSS readers compound the problem by not correctly implementing the conditional GET request that is part of the HTTP specification. "The original idea was for proxy servers to be able to cache content locally," he says. "If you send a date and the content does not agree, go and fetch the material again. Otherwise, the server just returns a 200-byte notification that you are current. The problem was that the dates must match-and most aggregators don't consider this. They look at the field being called date, and if it's even a few seconds different, the data gets sent again. And the time is often the local server time, rather than the client request time." Consequently, the server is constantly sending out "fresh" data, regardless of whether the content has changed. Do the math, says Murphy, and even 100 blog subscribers can tax a system-each querying the data at least 24 times a day. But that's the minimum. "It's human nature to be the first to have the news, so people reset their reader to query every 10 minutes." That results in a lot of hits.
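To make the conditional GET idea concrete, here is a minimal sketch in Python that runs without any network access. It is not Murphy's code or any particular aggregator's. The function names, the timestamp, the entity tag and the 400 KB feed size are all assumptions chosen for illustration. The client echoes back, unchanged, the Last-Modified and ETag values it received on its previous fetch, and the server answers with a bodiless 304 ("Not Modified") when nothing has changed. The last function then does Murphy's math for 100 subscribers polling 24 times a day, counting only body bytes and ignoring header overhead.

```python
from email.utils import parsedate_to_datetime


def conditional_headers(cached_last_modified=None, cached_etag=None):
    """Client side: echo back, unchanged, the validators the server sent last time."""
    headers = {}
    if cached_last_modified:
        headers["If-Modified-Since"] = cached_last_modified
    if cached_etag:
        headers["If-None-Match"] = cached_etag
    return headers


def respond(request_headers, feed_body, last_modified, etag):
    """Server side: return (status, response_headers, body) for one GET."""
    validators = {"Last-Modified": last_modified, "ETag": etag}
    client_etag = request_headers.get("If-None-Match")
    client_date = request_headers.get("If-Modified-Since")
    if client_etag is not None:
        # When an entity tag is sent, it takes precedence over the date.
        unchanged = client_etag == etag
    elif client_date is not None:
        unchanged = parsedate_to_datetime(client_date) >= parsedate_to_datetime(last_modified)
    else:
        unchanged = False
    if unchanged:
        return 304, validators, b""  # "Not Modified": headers only, no body
    return 200, validators, feed_body


def body_bytes_per_day(subscribers, polls_per_day, feed_body, send_validators):
    """Total feed-body bytes served in one day for a feed that does not change."""
    last_modified = "Sat, 01 Jan 2005 12:00:00 GMT"  # hypothetical timestamp
    etag = '"v1"'  # hypothetical entity tag
    total = 0
    for _ in range(subscribers):
        cached = {}
        for _ in range(polls_per_day):
            if send_validators:
                request = conditional_headers(cached.get("Last-Modified"), cached.get("ETag"))
            else:
                request = {}
            status, response_headers, body = respond(request, feed_body, last_modified, etag)
            total += len(body)
            cached = response_headers
    return total


feed = b"x" * 400_000  # assumed feed size of 400 KB
print(body_bytes_per_day(100, 24, feed, send_validators=False))  # 960000000
print(body_bytes_per_day(100, 24, feed, send_validators=True))   # 40000000
```

Under these assumptions, readers that ignore the validators pull about 960 MB of feed bodies a day, while readers that echo them back pull about 40 MB, one full download per subscriber. Raising polls_per_day to 144 models the every-10-minutes case.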
But others think that network bandwidth is so inexpensive that "blog clog" won't really happen, at least for those people serving text. "Bandwidth is getting cheaper every year," says David Winer. "I've learned as a software developer not to evaluate systems on today's deployment. You should always be thinking two or three years out." Winer says he's a strong believer in delivering enough content of the article in the feed so that a reader has a good understanding of what the full article says. Good examples include feeds from The New York Times and the BBC in which "the descriptions are written very competently, and they know that's all the reader needs. If I want more information, then I click on the link, get the story, and also get an advertisement, which pays for the feed. But with blogs, I'd prefer to see the entire text because I might be reading it on an airplane or commuting where I don't have a net connection to click on a link.
"Where you really have to worry is with podcasting, with these huge MP3 files slogging around. That gets interesting from a bandwidth standpoint: podcasts are sometimes 40MB. If you have a thousand subscribers to that, you can exhaust your allocated bandwidth in one day." Winer says that if bandwidth ever becomes a significant problem for podcasts, BitTorrent, the increasingly popular peer-to-peer file technology in which bandwidth is shared among participants, could be the solution.
Sidebar: An Interview with David Winer
David Winer has long since achieved wizard status. In addition to co-authoring RSS .91 with Netscape and authoring RSS .92 and 2.
- Are you surprised by the success of RSS?
- Not really-I'm actually surprised it took so long. It's a pretty rational idea. I remember the moment in 1999 when I realized that this was going to be the way I was going to read news on the Web forever. And I'm sure a lot of other people have had that moment. RSS has automated a part of the drudgery of using the Internet-and that's how computers evolved: by automating things that human beings do. We did have to wait until there was a critical mass in terms of support from news providers. In 1997 and 1998, that certainly wasn't true. But in 1999, it started taking off with early adopters like Red Herring, Wired, Salon, and News.com, along with lots of blogs.
- The way I used to get news was to look at the sites and try to figure out what's new. You do a lot of clicking that way without finding that much new. Today, every hour, my aggregator finds hundreds of things that might be of interest to me. I could never go back.
- Is the popularity of RSS driven by news or blogs?
- What's the difference between them? We could get started on a real long discussion about that. News organizations are collections of people. And if you take one of those people and put him in his own blog, nothing really changes. I subscribe to about 300 different feeds, and while I've never done the count, my guess is that about half are blogs.
- What was your involvement with The New York Times?
- Userland did their feed-that's how The New York Times got into RSS. I was having dinner with Martin Nisenholtz, the CEO of New York Times Digital. I wanted them to do blogs and support blogs, and I got one-half of what I was looking for. We wound up producing their feeds for the first few years.
- Was that some kind of benchmark for RSS?
- Of course: The New York Times is The New York Times. I don't like feeding that: it drives their arrogance. But I grew up in New York reading The New York Times and modeled my writing after a number of New York Times writers. But yeah, they are one of a small number of publications that can validate a concept. Maybe them more than anybody else. They've been very good at jumping on the Web on many different levels. They were one of the first publications to have a website, as well as to have RSS feeds. Now there is basically universal coverage from the top-tier publications worldwide. Among them, RSS is now more conspicuous by its absence than its presence.
- Regarding the RSS 2.0 specification, how did you decide to at least temporarily freeze it?
- There's nothing temporary about it. If there had been a cooperative process in the developer community where breakage [i.e. breaking backwards compatibility] was considered an important issue, we could have left it unfrozen and kept going. But that was not the case. People kept arguing about whether or not we should throw the whole thing out and start over again. These weren't people with large, installed bases, they didn't have a lot at stake, but they were getting listened to. And working with the publishers and bloggers, I felt they had no interest in upheaval-in changing the way it worked.
- Ever since early 1999, there really hasn't been room for change. Once something like that is deployed, you can talk all you want-it isn't going to change. This has been a major misunderstanding, that somehow I have some power over whether the spec is frozen or not. But I don't have that power-it is what it is. RSS is extensible. I have yet to hear of a single thing that anyone wants to do with it through its extensibility.
- Once you get adoption as RSS has experienced, why would you want to change it? The whole idea of having XML formats is to get these large and small organizations all playing together on the same field. That's beyond anyone's dream, so why would you want to screw around with that?
- What's the legal status of the specification?
- RSS isn't owned by anybody; the format isn't copyrighted. But the specification, which I wrote, is copyrighted; there seemed no way to avoid that. So I put a very liberal copyright on it patterned after the IETF [Internet Engineering Task Force] copyright, that basically says that anybody can use the spec for any purpose as long as you give attribution. Creative Commons [an organization that allows the copying and distribution of work while allowing the author to retain the copyright] puts a legal stamp on that. Being at Harvard, I had access to some good lawyers that are interested in being expansive in what people can do.
- One of the things people were looking for at that time was that RSS be independent of a company and person. Harvard's a great brand name and one that had not appeared in technology until then. I thought it appropriate for RSS. The MIT "brand" was all over lots of XML stuff, which is appropriate. The idea here was to put a humanities stamp on RSS, because RSS is largely about human beings, literature, journalism and the values that Harvard stands for. What better way to say that? That's why I insisted RSS be kept simple because this is one of the tricks engineers use to keep users from having any power. By making the simple things appear too complicated, they scare users off.
- Is there anything else you'd like to say to Software Design readers?
- Ganbatte!