The majority of this post was written before all the Facebook privacy concerns of May 2010 occurred, but those events make the whole issue much more relevant. I commented on a story about a user who loved AIM shortly after Buzz was launched, and, as I have in the past, I got reactions of confusion from some of my followers.
So I wanted to take a moment and outline some of the principles that guided the design of the internet, and identify some recent trends that violate those principles.
Computers and Connections Are Unreliable
When it was first designed, the internet was not a commercial enterprise. It was designed around the concept that many computers all over the country, or the world, could talk to one another. Because the computers were spread out geographically, they had to be connected by wires in order to talk. This led to a situation where wires were strung out across vast distances, and an obvious concern arose that the wires could be cut or otherwise damaged. Another concern was that sometimes computers crashed, making them unavailable. This was particularly damaging in the case where computers would send messages to each other through other computers. For example, if a computer in New York wanted to send a message to a computer in San Diego, it might first hand the message to a computer in Chicago, and that computer would then hand the message to the destination in San Diego. But what if the computer in Chicago lost power? That would be a situation very much like a wire getting cut.
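The New York to San Diego example can be sketched in a few lines of Python. This is only an illustration, not how real internet routing is implemented: the link map is hypothetical (only New York, Chicago and San Diego come from the example above; Atlanta and Dallas are made up), and `find_route` is a plain breadth-first search that skips any computer listed as down.

```python
from collections import deque

# Hypothetical wiring between computers. Atlanta and Dallas are invented
# here so that a second path exists when Chicago is unavailable.
LINKS = {
    "New York": ["Chicago", "Atlanta"],
    "Chicago": ["New York", "San Diego"],
    "Atlanta": ["New York", "Dallas"],
    "Dallas": ["Atlanta", "San Diego"],
    "San Diego": ["Chicago", "Dallas"],
}


def find_route(links, source, destination, down=frozenset()):
    """Breadth-first search for a path that avoids every computer in `down`."""
    if source in down or destination in down:
        return None
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbor in links.get(node, []):
            if neighbor not in visited and neighbor not in down:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # every path is cut


if __name__ == "__main__":
    # Normal case: the message goes through Chicago.
    print(find_route(LINKS, "New York", "San Diego"))
    # Chicago has lost power: the message takes the longer southern path.
    print(find_route(LINKS, "New York", "San Diego", down={"Chicago"}))
```

With Chicago marked as down, the message still arrives, just by a longer path. If both Chicago and Dallas are down, `find_route` returns `None`, which is the "wire getting cut" case.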
So, there was a realization that the network was vulnerable to attack and/or failure. This led the designers to assume that no particular machine could be counted on to be reachable at any given time. But they still thought they could make use of the network to accomplish useful things, even with the uncertainty about how the computers were connected. To do this, they designed the services on the internet to be federated.
Solution: Federated Services
A federated service is designed the same way as states are under a federal government. The United States are a collection of states, but they are joined together as a federation by a federal government. If California has a budget crisis, the damage is somewhat contained to California; North Dakota can still have a budget surplus and a healthy economy (maybe). So, the idea was to make services on the internet like that: if one service provider failed, the other service providers would be mostly unaffected.
The most successful example of a federated service is email. Many different companies, organizations and individuals run mail servers, and those servers form a federation with each other to route mail through the entire system. Yahoo might lose connectivity to the federation one day, and Yahoo’s users won’t be able to send mail anymore, but everyone else, including my mail server at http://etherplex.org, will still work.
So, federated services avoid having a single point of failure. That means that in order to shut down email worldwide, there’s no single place that you could destroy; instead, you’d have to destroy thousands of places to make a real dent in the global email system. There are lots of internet services that implement this model, and almost all of them were designed in the early days of the internet. Internet Relay Chat (IRC), Usenet (newsgroups), email, and HTTP (web browsing!) are all federated, in that the crash of one computer in the network won’t prevent all the users of the service from accessing it. This is good! Imagine an internet where a crash on Microsoft’s web site would prevent users from accessing Amazon’s web site. Because they are independent sites, this doesn’t happen.
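To make the "no single point of failure" property concrete, here is a toy model in Python. Everything in it is an assumption made for illustration: the `MailServer` and `Federation` classes and the `.example` domains are hypothetical, and the dictionary inside `Federation` merely stands in for the lookup step that real mail servers use to find each other.

```python
class MailServer:
    """One independently run member of the federation (hypothetical model)."""

    def __init__(self, domain):
        self.domain = domain
        self.online = True
        self.inboxes = {}  # local user name -> list of (sender, body)

    def deliver(self, user, sender, body):
        self.inboxes.setdefault(user, []).append((sender, body))


class Federation:
    """Stand-in for server discovery; it stores no mail itself."""

    def __init__(self):
        self.servers = {}

    def join(self, server):
        self.servers[server.domain] = server

    def send(self, sender, recipient, body):
        """Return True if delivered. Only the two servers involved matter."""
        _, sender_domain = sender.split("@")
        user, recipient_domain = recipient.split("@")
        source = self.servers.get(sender_domain)
        target = self.servers.get(recipient_domain)
        if not (source and source.online and target and target.online):
            return False
        target.deliver(user, sender, body)
        return True


if __name__ == "__main__":
    net = Federation()
    for domain in ("big.example", "small.example", "other.example"):
        net.join(MailServer(domain))

    # One large provider drops off the network.
    net.servers["big.example"].online = False

    # Its own users are stuck...
    print(net.send("alice@big.example", "bob@small.example", "hello"))
    # ...but everyone else keeps working.
    print(net.send("bob@small.example", "carol@other.example", "hello"))
```

The failure of `big.example` only affects mail to or from `big.example`; the other two servers never notice.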
How do you design a federated service? Well, you get everyone to agree on how the servers should talk to one another. This is called a specification. Then, programmers can take the specification and program a computer to do what the specification says. When they connect that computer to the network, it can now be part of the federation. For this to work, the specification for the service has to be openly available. Because anyone can implement it, another property of federated services is that the software that allows a computer to join the federation is often open source, giving even people with limited technical knowledge and resources the ability to join the federation. There are open source web servers, IRC servers, newsgroup servers and email servers.
The Trend Away From Federation
The idea of federation gained favor before the internet was commercial. In recent years, the commercialization of the internet has fueled an interest in creating reliable servers with reliable connections. This allows companies that want to make money to create services that no one else has by putting their offerings on reliable servers and not sharing the specification for them with anyone. Lots of services follow this model: Google (in its search business), Twitter, MySpace, Facebook, Yahoo, AOL Instant Messenger, Skype, Ventrilo, TeamSpeak, and Apple’s MobileMe are all examples. Another reason centralization is so attractive to companies is that it allows them to compile information about their users, which provides credible backing for the internet’s biggest money-maker: advertising. The theory goes that if you know your users well, then you can advertise to them more effectively.
But it’s not just that new services tend not to be federated. Companies have also found a way to “unfederate” the old services, like web sites and email. Huge email providers, like Hotmail, Yahoo Mail and Gmail, have so many users that the loss of any one of those services really would cripple the email infrastructure. Running a web server is fairly easy, but many online startups need to grow their business quickly, and buying computers and hiring an administrator to run hundreds of web servers is harder. So, Amazon makes their “cloud” of computers available for rent. As online startups need more computers, they just rent compute power from Amazon, which is cheaper and easier than trying to manage their own servers. The problem with this is that when Amazon’s collection of computers goes offline, all those companies that relied on them go offline as well. So, what was once federated is now centralized.
What We’ve Lost, And How We Can Get It Back
What’s the problem with centralization? Well, it leads to systems that are prone to failure in ways federated systems are not. It makes users dependent on a single company for a service they might have come to depend on. It allows one entity to collect information in one place about users. Because centralized services don’t have to play nicely with the rest of the internet, it makes it possible for users to put data into that service, but not be able to get that data back out.
What does it mean to get your data “out” of a service? With email, it means being able to download it to your computer. With Twitter, it means downloading an archive of all your posts. With Facebook, you should be able to get an archive with all your notes, wall posts, photos and chat history. With Google Docs, it means I should be able to download, in bulk, my documents in a format that can be edited on my own computer. With a contact list, you should be able to export your contacts to vCard or CSV format. Some services offer this kind of data extraction, but many do not. That means that users will spend time uploading and organizing their data with a service, but if they want to take that data elsewhere, they have to start over. This gives the company a sort of “lock-in” competitive advantage, but it is bad for users: because the switching cost is so high, they tend to stick with the service they are already using rather than choosing the best one.
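As a rough idea of what a contact export involves, here is a minimal Python sketch that writes the same contact list as CSV and as vCard. The contact records and the field names `name` and `email` are made up for the example, and the vCard output is deliberately bare (version 4.0 with only the `FN` and `EMAIL` properties); a real exporter would handle many more fields.

```python
import csv
import io

# Made-up contacts; a real service would read these from its own storage.
CONTACTS = [
    {"name": "Ada Example", "email": "ada@example.org"},
    {"name": "Bob Sample", "email": "bob@example.com"},
]


def to_csv(contacts):
    """Return the contacts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "email"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(contacts)
    return buffer.getvalue()


def _escape(value):
    """Escape the characters that are special inside vCard text values."""
    return (
        value.replace("\\", "\\\\")
        .replace("\n", "\\n")
        .replace(",", "\\,")
        .replace(";", "\\;")
    )


def to_vcard(contacts):
    """Return the contacts as a sequence of minimal vCard 4.0 entries."""
    cards = []
    for contact in contacts:
        lines = [
            "BEGIN:VCARD",
            "VERSION:4.0",
            "FN:" + _escape(contact["name"]),
            "EMAIL:" + _escape(contact["email"]),
            "END:VCARD",
        ]
        cards.append("\r\n".join(lines) + "\r\n")  # vCard lines end in CRLF
    return "".join(cards)


if __name__ == "__main__":
    print(to_csv(CONTACTS))
    print(to_vcard(CONTACTS))
```

Either output can be saved to a file and imported into another address book, which is exactly the kind of exit a locked-in service does not give you.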
The Three Critical Questions
So, when you choose to use various services on the web, it is important to evaluate them in terms of both what it costs to join them, and what it would cost to switch away from them. To crystallize your thinking, it is useful to have some questions already outlined that help you determine whether a service is at risk of locking you in.
- Does this service operate with other services of its type?
If it is a messaging service, social network or chat service, does it allow you to chat with users of other providers? If not, then be aware that you’ll have to choose your service based on where your friends are. This is bad for the consumer, but good for the business: it leads to an avalanche effect in which one service tends to definitively “win”, at least for a while. See “Facebook”. Compare the situation of social networks with that of email providers.
- Can I only access the service in a web browser, or does it provide a way for other programs to interact with it?
This is related to the notion that the service provides a way for programmers to access its functionality (this is known to programmers as an “API”, or application programming interface). It is a form of openness, but by no means guarantees federation: Twitter, for example, has an API, but is not federated. Still, it’s nice to be able to update Twitter with the tool of your choice, whether it be the browser, a desktop client, or your mobile phone.
- If I need (or want) to leave this service, can I get my data out of it?
If it is a service whose value lies in storing your data, how easy is it to get your data out of it, into a standard format that you could import into a new service or program? For email, usually you can extract your data over POP or IMAP and store it however you like. For social networks, this topic is still largely unexplored. For photos, there are services, like Picasa, that allow entire albums to be downloaded at full resolution. Other services, like Facebook, permanently degrade the quality of the photos uploaded, and provide no mechanism to even get those back to your computer.
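For the email case, extracting your data over IMAP can look roughly like the Python sketch below, which uses the standard library’s `imaplib` and `mailbox` modules to copy a folder into a local mbox file. The function names `export_folder` and `connect` are my own, and the host, user name and password in the usage comment are placeholders; whether your provider allows IMAP access at all is something you would have to check.

```python
import imaplib
import mailbox


def export_folder(conn, folder, mbox_path):
    """Copy every message in `folder` to a local mbox file; return the count.

    `conn` is a logged-in imaplib connection (or anything with the same
    select/search/fetch methods).
    """
    conn.select(folder, readonly=True)  # read-only: nothing changes on the server
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("could not list messages in " + folder)

    box = mailbox.mbox(mbox_path)
    saved = 0
    try:
        for num in data[0].split():
            status, parts = conn.fetch(num, "(RFC822)")
            if status != "OK":
                continue  # skip messages the server refuses to hand over
            for part in parts:
                # The raw message is the second item of the tuple entries.
                if isinstance(part, tuple):
                    box.add(part[1])
                    saved += 1
        box.flush()
    finally:
        box.close()
    return saved


def connect(host, user, password):
    """Open an encrypted IMAP connection and log in."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    return conn


# Example usage (placeholder values, not a real account):
#   conn = connect("imap.example.org", "me@example.org", "secret")
#   print(export_folder(conn, "INBOX", "inbox-backup.mbox"))
#   conn.logout()
```

The resulting mbox file is a plain, widely supported format, so the mail can be opened by another program or uploaded to a different provider.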
AIM supports instant messaging, but so does Jabber (which is what powers GTalk), and Jabber can interact with many chat providers. A federated version of Twitter exists called identi.ca, which supports OpenMicroBlogging, a standard specification for Twitter-like services (that Twitter itself doesn’t implement!).
One of the reasons I support Google more often than its competitors is that they stand out as a company that supports federated services more than any other major internet player. From Google Wave to WebFinger, Google integrates federated services into their architecture whenever possible. When they deployed a competitor to AIM and ICQ, it was Jabber-based. When they deployed a successor to email, it was open and federated. It’s good to be on the look-out for companies that understand the network and understand that the health of the internet fundamentally relies on federated services.