Latest Publications

Thoughts on Google Nexus One

I Love Gadgets

Anyone who knows me can tell you that I'm that "gadget guy". I have tons of little devices of all sorts lying around, from digital audio players to full fledged portable media players. Some I thought would be wonderful but never quite worked out as well as I hoped (Cowon O2), while others should have been a waste of money, but I ended up using a ton and loving (PSP Go).

As such, it makes sense that I have bought and used a fair number of cell phones. In the past year, I have tracked Android's development quite closely, having never really been satisfied with the offerings of the elephant-in-the-room iPhone. I bought a "dev phone" Google G1 in December 2008, and in August 2009 decided to switch to TMobile and get a "MyTouch 3G" (HTC Magic), since TMobile seemed to have embraced Android more than other carriers. Despite the having just purchased the MyTouch, I was compelled to buy a Nexus One the day it was released. But, before I get to that, let me digress a moment into the problems Android solves.

The Problem With The iPhone

As I alluded to above, I'm less that pleased with the iPhone. Why, you may wonder, would I be dissatisfied with the phone that has obvious enchanted so many? The reasons are few and simple.

First, it is a closed platform, insofar as Apple has final (opaque) say over what applications can run on it. I don't tolerate that in my computers, and there's no reason I should tolerate it in my cell phones. How does this affect me? Well, I use Google Voice, for example, and Apple has consistently worked to block Google Voice from ever making its way to the iPhone. There are lots of examples like this. In the end of 2008, Apple blocked a podcasting application because it "duplicated the functionality of iTunes", when it was clear to everyone involved that the app added quite a bit functionality absent from iTunes. Most recently it has come to light that Apple will outright reject any app that mentions Android. They are also now forbidding apps that use the iPhone's location service to serve geographically relevant ads. The point is, Apple has declared that they own not only all the apps that will ever appear an the device, but that the reserve the right to prevent any app from ever reaching any device, even for reasons that were previously unannounced. As a developer, I find this oppressive. It can cost tens of thousands of dollars (or more!) to develop an app and submit it, and only then do you find that, despite your best efforts, Apple rejects it, resulting in a total loss for the developers.

What else does a closed platform prevent? Well, Free Software, of course. Because Apple charges an end-user $100 to allow them to compile code for their phone and install it, it effectively prevents regular users from getting the source to an app, compiling it, and installing it for their own use. This restriction is in addition to the distribution restriction outlined above.

At this point, if you're still reading, I'm sure you're thinking "Yeah, I get it Rick. It's closed…but that doesn't really affect me that much. What's the real problem?"

Well, the second problem is that Apple prioritized battery life over utility. It's a straight trade-off, and in the end, it's a judgement call. I just think Apple made the wrong decision. The iPhone doesn't multitask. That means it literally can't do two things at once. And once you've had a phone that does, you can't imagine life without it. Which leads me neatly into my review of the Nexus One.

The Virtues of Android

For a device so obviously inspired by the iPhone, Google made so many decisions differently, it's kind of surprising the Nexus One appears to be so similar to the iPhone.

Whereas the iPhone OS is closed, Android is open, and used by dozens of companies around the globe.

Android's market allows apps openly, and even if you can't distribute your app through the market for some reason, you can still host it on your own site and have users install it directly. Kind of like how computers work.

Android has multitasking. This means you can stream Internet radio in your car while having your phone navigate you to your destination. The phone will pause the music to speak the directions to you, and then resume where it left off. It will notify you of incoming email or instant messages all the while, and update weather and news in the background to make sure that when you do glance at it, everything is up-to-date.

The Nexus One

But so far, all I've mentioned is the benefits of Android. Why is the Nexus One special?

Polish. The phone is a pleasure to use. The screen is bright and large, with high resolution (800×480). The chip powering the phone is the fasted phone chip in the world, running at 1 GHz. This means that where previous generations of Android ran sluggishly on their hardware, the Nexus One is as smooth as butter.

The Nexus One makes use of that power. It integrates 3D in subtle and stylish ways (like in the gallery application). It offers "live wallpapers" that add a bit of shine to the user experience that makes you feel like you're holding a special piece of technology. The speed of everything makes the phone extremely usable; you don't end up waiting for the phone to do something. This makes the apps much more useful, so much so that I often just check email, Twitter and Facebook on my phone, rather that messing with my laptop.

It also integrates voice recognition in an unprecedented way. Anything from "Directions to In-N-Out" to "Call Dad" are understood clearly, even in noisy situations. If an explicit command is not given, a Google search is executed. But other than just launching activities, voice is integrated into the keyboard. Got a text or instant message? Instead of typing my response, I usually just say it. That saves time, and allows me to respond to short texts quickly as I walk into the store, rather than stopping and taking the time to type them out.

Recommended Apps?

So, in my 13+ months of Android use, what do I use? Seesmic for Twitter access. Facebook to check what's up with my friends on the go. NewsRob is incredible for reading Google Reader on the go. I use Advanced Task Manager to manage the processes on the phone (remember, it multitasks!). Aldiko is a superb little ebook reader application that offers seamless download of classic books. I use TuneWiki for music, and I've even bought music using the AmazonMP3 app and played it immediately. I like QR Codes, so I use ZXing's Barcode Scanner more frequently that you'd imagine.

In the realm of games, I have to hand it to Vexed Pro, a version of my favorite puzzle game of all time I used to play on the Palm Pilot. I also play CrystalLight Defense, Robo Defense, Radiant and Jewellust, all of which are well worth the couple of bucks I paid for them.

Concluding Thoughts

To be really successful, a gadget has to have great combination of hardware and software. The Nexus One really shines in the hardware department, and brings a new level of polish to the Android operating system. As though the default software experience using Android 2.1 weren't enough, the raft of Android phones hitting the market (including the Verizon Motorola Droid) has really jump started the app development for Android, and the quality of the apps overall is really incredible. Combined with the open nature of both the market and the platform, Google has a real winner on their hands. Despite the expected slow sales partially due to its unorthodox sales channels and advertising strategy, if you're going to get a new phone in 2010, you should give the Nexus One a very close look.  Of all the gadgets I've ever owned, it is by far my favorite.

via Rick’s Posterous

“AAA Indie” Game Development

Notice that an independent game development studio, Zero Point, is working with the community fairly closely to produce a space shooter called Interstellar Marines.  They are funded largely by pre-sales of the game, and to maintain credibility, they release video and game demo material so the community can see their progress.  Their “Bullseye” training area demo is playable on Mac and PC on the web because they are developing the game using the Unity engine, which is cross-platform and runs in a browser.  They don’t currently have, and don’t plan to have, a “publisher” in the traditional sense, which is kind of exciting; the people working to make the game are the ones who will distribute and profit from it.  There’s something pure about the simplicity of it.  And, true to Zero Point’s claims of Interstellar Marines being an “AAA” title, it actually looks pretty good!

Bullseye Demo Screenshot

After 10 Years, Trying Fedora Again; Wireless Fails and Fixes

I started using Linux back in 1997 when a buddy and I put Slackware, floppy by floppy, onto a Pentium 133.  I didn’t start using it full time until a couple years later, when Microsoft started requiring online registration with product keys.  Since then, I’ve used Red Hat, Gentoo, Debian, Arch and Ubuntu.  I abandoned Red Hat because of “dependency hell”, which the other distributions avoided entirely.  So, it’s been about 10 years, and as I’ve tracked Red Hat’s dismissal, and subsequent rewarming to, the desktop market, I have noticed that Red Hat, Fedora Core, and finally Fedora have learned from Ubuntu and made a better experience for the casual user.

So I installed the just-released Constantine to see how it measured up to my beloved Debian-derivatives I’ve been using the past few years.  I installed it on a netbook that I love: the Asus EEE 1002HA.  The verdict?  Yeah, it just works.  It has a smooth boot up, detected screen resolution well, has good sound support, allowed me to customize tap-to-click and scrolling on the touchpad, and allowed me to set up dvorak and rebind caps-lock to control simply and easily.  What didn’t work so well?  The Atheros AR928x chipset, which is why I’m posting this.

Short version: the Atheros AR928x is supported by the open-source code contributed by Atheros to MadWifi under the driver name “ath9k”, which is the kernel module Fedora 12 loads to support this hardware.  Unfortunately, the version included is a bit behind the times, and causes odd intermittent connectivity issues when associating with access points, which effectively makes the wireless on the machine next to useless.  The good news is that it is easy to fix.  Here we go:

  1. Update.  Get root on the command line and type ‘yum update‘.
  2. Install your development tools so you can compile things.  As root: ‘yum groupinstall “Development Tools” “Legacy Software Development”‘.
  3. Head over to the download site for compat-wireless and download the latest version of the ath9k driver.  Uncompress it.  Navigate to the directory it is in on the command line.
  4. Select the ath9k driver: ‘./scripts/driver-select ath9k
  5. Build: ‘make
  6. Install: ‘make install
  7. Unload your current ath9k driver: ‘make unload
  8. Load the new driver: ‘modprobe ath9k‘ (or simply reboot)

There were lots of discussions across the web about various hacks and methods to get ath9k to play nicely with the AR928x chipset, but this seemed to be the most straightforward approach to me, and it seemed to work really well, so I thought I’d post it here in hopes it would be useful to someone.  Happy hacking!

Static Typing and Functional Languages

With a title like that, it’s not really clear where I intend to go with this post, but bear with me for a moment.

A couple of years ago, I got into functional programming, initially with Common Lisp (only a tiny bit), and then Scheme and Scala, a pinch of OCaml, and finally Clojure.  One thing I noticed about all the Lisps was that they weren’t statically typed in the way I was used to in Java.  Then I noticed that all the functional languages that were statically typed (OCaml, Scala) were also more modern than the Lisps.

Anyway, I continued working in functional languages for months, growing more accustomed to the power they offered, and changing how I thought about solving problems as I went.  Design patterns seemed much less useful, and languages that didn’t support higher order functions, TCO, and the basic map, reduce, and lambda functionality started to feel increasingly awkward.

So, as I worked in Java for much of my professional work, I was naturally drawn to investigate Functional Java, a set of libraries that offered at least a partial veneer of functional programming on top of the Java language.  Part of that library was a testing framework inspired by QuickCheck called Reductio Test.  Reductio was cool, and the examples looked simple and powerful.  Then I tried to write my own generator for Reductio.

The code was awful.  It was an explosion of objects attempting to replicate higher order functions, but constrained by Java’s type system.  As such, the types of the higher order functions were generified on types of lower-arity higher order functions, and so on, until finally all the parameters had been consumed.  I thought I must be doing it wrong, so I browsed the Functional Java source tree over on the Google code site where it is hosted.  In Function.java, I saw the same problem emerge:

 /**
   * Curry a function of arity-5.
   *
   * @param f The function to curry.
   * @return A curried form of the given function.
   */
  public static <A, B, C, D, E, F$> F<A, F<B, F<C, F<D, F<E, F$>>>>> curry(final F5<A, B, C, D, E, F$> f) {
    return new F<A, F<B, F<C, F<D, F<E, F$>>>>>() {
      public F<B, F<C, F<D, F<E, F$>>>> f(final A a) {
        return new F<B, F<C, F<D, F<E, F$>>>>() {
          public F<C, F<D, F<E, F$>>> f(final B b) {
            return new F<C, F<D, F<E, F$>>>() {
              public F<D, F<E, F$>> f(final C c) {
                return new F<D, F<E, F$>>() {
                  public F<E, F$> f(final D d) {
                    return new F<E, F$>() {
                      public F$ f(final E e) {
                        return f.f(a, b, c, d, e);
                      }
                    };
                  }
                };
              }
            };
          }
        };
      }
    };
  }

I was stunned.  What had gone wrong?  After a few minutes or hours, I realized the reason Lisp had used dynamic typing: having to explicity specify all the types for each of the higher order functions destroys the readability of the code!  So, languages like Javascript, Lisps, Python and Ruby get too avoid all that.  So how do OCaml and Scala work?

Well, a lot of work had to be done to support static typing with functional programming, and it required the development of a type inferencer to make code readable.  The end result is a sophisticated compiler that has the ability to determine the types in the code without the programmer specifying them exhaustively.  This functionality is advanced, however, and it took a long time to develop.  That is why OCaml and Scala are much newer that Lisp, for example (though it is true that OCaml and Python are about the same age).

So, suddenly it all came together for me.  I finally had a glimpse as to how the type systems interacted with the programming paradigms, and why languages made the decisions they did.  Type inferencers can be tricky to write well, so there is a trade space language designers have to navigate to get the result they’re looking for.  Dynamic typing offers simplicity, but also comes at the cost of rigor and the ability to catch entire classes of bugs at compile time, rather than runtime.  Sometimes that cost is worth it, sometimes not.  That is why, even today, there is room for dynamic languages like Python and Ruby alongside static languages like Scala and OCaml.

The Copyright Paradox

I was reading a really interesting blog post on copyrights, and this amazing paradox about copyright struck me.

Right now, the internet is changing the way text, music and video are distributed.  Part of that change is the ease with which data can be copied.  If you look at what the U.S. supreme court said about copyright, they quite literally made an allowance for data to be copied from the hard disk of a computer to the RAM of the computer, mainly because it was necessary for the use of the data.  I mean, this is the level at which we are trying to enforce copyright law.  If they could find a way to charge extra for you to copy that MP3 file from disk to memory so you could play it, I’m pretty sure they would.

But when I talk about “they”, I’m really talking about some enormous corporation.  Copyright infringement is pretty much a civil issue in most cases, and therefore enforcement is left to the copyright holder.  And that’s the paradox.  We all like to imagine that the struggling solo artist making music in his or her garage is getting screwed because their songs  are posted on Kazaa (or whatever), and that by keeping really strong copyright laws we’re somehow helping those kinds of artists.  Here’s my revelation: copyright law only helps those that have the (quite considerable) resources to enforce it themselves!  So, no matter how strong we make copyright, it doesn’t really help the poor artists — the solo writers, musicians, or filmmakers.  Because they’re busy doing what they do best: creating art.  If people want to copy their stuff, they will, becuase the internet allows it.  Those that love it will buy, those that don’t will pirate it.  Copyright is pretty much completely out of the picture, because enforcement is nowhere in sight.

So who does copyright benefit?  Well, the huge companies that exist soley to aggregate copyrights.  The music labels and maybe the game development houses.  Only the entities that have huge sums of money to track down and sue infringers will be able to leverage copyright laws in an attempt to curb piracy.  And therein lies the irony: those companies got all that money by skimming off the work of the artists they represent, so the money was lost to those artists anyway.

My conclusion?  Copyright really only helps large corporations that exist solely to aggregate copyrights.  The artists that it is supposed to reward end up doing best by embracing the sharing the internet affords us, rather than trying to fight it.

Re-thinking the Tabs Model in Modern Browsers

TL;DR version: Modern browsers need to learn a lesson from Emacs and keyboard launchers and provide an interface for tab-switching that is keyboard search-as-you-type based.  Firefox is the best browser to implement this on, since it is very extensible.

The Extensibility of Firefox

Mozilla recently hosted a Summer Design Challenge for 2009 that focused on “Reinventing Tabs in the Browser”.  Firefox is a fabulous platform to host such a challenge on, as it is the most extensible of all the browsers currently available.  Let me digress into why I say this quickly before I get into my main point (which is about tabs!)

Mozilla itself has gone to great length to expose the browser internals by making the browser open source, which allowed for modification. Next, they provided a framework for developers to write add-ons for Firefox, which provided a robust architecture for extension. The primary distinction between modification and extension is that two modifications to a piece of software may well be mutually exclusive, while two extensions are designed to work side-by-side.  These two aspects of Firefox’s feature set have been well known since its release, and the add-on (extension) community has been very active since Firefox’s release.

But, in recent months, Mozilla has released two extensions that make the process of extension itself even easier.  One is called Ubiquity, which provides a command-based interface to the browser and web, and allows for it to be extended by writing Javascript in its configuration interface (in a browser window with the address “about:ubiquity”).  This type of extension caught on quickly with fans of command-based systems like Launchy and QuickSilver, (and later, Google’s Quick Search Box, authored by the developer responsible for Quicksilver, also only available for Mac) but was otherwise relegated to a niche of the Firefox user base.  One of its primary features was the novel mechanism by which Ubiquity could be modified: simply visiting its configuration interface (inside the browser itself), and typing into a text box.  That was it.  No saving, reloading or compiling.  As soon as the code was typed, it was “live”, and immediately available within the Ubiquity launcher.  The distinction between “development” and “use” was almost non-existent.

This feature was so compelling that it deserved its own extension, independent of the concept of a command-based launcher embedded inside Firefox.  Whereas Ubiquity provided an API centered around the user interface of Ubiquity, it offered little to modify aspects of Firefox outside of the Ubiquity interface.  To fill this more general need, JetPack was born, also from Mozilla Labs.  JetPack had all the quick-development goodness of Ubiquity, but without the extra baggage of a command-based interface.  JetPack was designed to allow extensions for Firefox to be written quickly and simply, and distributed on the web with equal ease.  Though still young (JetPack has only been released for a couple of months), it provides a polished interface for development.  Its internal browser-based editor for coding is based on Mozilla’s own Bespin, and provides all the immediacy of Ubiquity development with extra goodies like line numbering and syntax highlighting thrown in.

So, hopefully at this point you understand why I assert that Firefox is the world’s most customizable browser.  My concluding thought on this topic is that this is all part of Mozilla’s effort to work towards a browser that is designed to be restarted only rarely, is extensible while it runs using a built-in interface, and allows users to change their working environment as easily as developers, because no special tools are required.  This vision is not entirely new; if you are familiar with Lisp-based environments, the REPL, or GNU Emacs, all this will sound very familiar.  So, it would seem Mozilla is learning well from successful projects in the past.

Why Tabs?  Why Not “Buffers”?

Which brings me to my main point (sorry it took so long!)  As Mozilla hosted the design challenge to reinvent tabs in the browser, many teams with a wide variety of approaches sought to provide meaningful improvements to the existing interface.  Every team did research with users by watching users use Firefox, and/or interviewing users about their browsing techniques.  Every team noticed one glaring deficiency in the existing tab interface: it doesn’t scale.  Any user of Firefox that has tried to manage 50 tabs has had trouble.  Even the interface has trouble: it tries to show many of them, and finally gives up, reverting to a scrolling tab bar.  So, when asked to improve the existing interface, every team tried to improve the tab interface’s scalability.

What I found so surprising as I reviewed the entries was that, in a community that has learned so much from Lisp and Emacs, most entries completely ignored the notion that a keyboard-based navigation system scales much better than a mouse-based system.  The trade-off, of course, is learning curve and prior knowledge.  We can attribute the meteoric rise in keyboard launchers across all operating systems to “Start Menu Clutter” – the simple problem that, when you install a few hundred programs, each wanting its own start menu item, quick launcher, and desktop shortcut, you rapidly drown in a mosaic of brightly colored icons, all alike.  But, if you know what you want (“Firefox”, “Word”, “Emacs”, or “iTunes”), you can use a keyboard launcher, type a few letters, and avoid the clutter entirely.  Such systems prohibit “browsing” of available programs, but for most users most of the time, the desired program is known in advance, rather than discovered by searching the start menu.

One entry provided effective keyboard navigation (it was called “Collapsible Tab Groups“) and it provided good scalability.  In particular, it provided two pieces of functionality that should be present in more software today, one of which comes straight from Emacs.

First, (this is the non-Emacs feature), it puts the UI elements on the side of the screen, rather that the top or bottom.  Why is this so good?  Most screens are widescreen.  Heck, even the 1600×1200 screens are wider than they are tall, but even those are going out of style to favor wider aspects (I’m typing this on a 1680×1050 display).  So, given the fact that displays are wide, but most material we read is tall (think word documents, most PDFs, and almost every web site), do developers insist on sticking every interface bar on the top or bottom of the screen, further constraining the already-limited vertical dimension?  Well, collapsible tabs groups gets it right, and puts the tabs on the side.

Second (this is the Emacs feature), it provides “search as you type” for navigation.  What does this mean?  It’s the kind of behavior you get in desktop keyboard launchers.  If you type “irefox” or even “ire” into Launchy or QuickSilver, you will get “Firefox” as the launch option because that substring matches the “ire” in “firefox”.  This is the feature Emacs uses to manage buffers with ido-mode.  You open the interface to change buffers (files, roughly) in Emacs and it presents you with a small interface that lists just a couple of the (possibly tens or hundreds of) buffers you can switch to.  As you start typing, each character is added to the search criteria as you type, and the list is refined to reflect the relevant buffers.

This approach works perfectly in a browser.  The problem with Mozilla’s challenge is that they asked people to reinvent tabs; the realization is that we don’t need tabs.  “Tabs” implies a user interface, which should be customizable.  What we really want is a model for navigation, and a set of pluggable user interface elements that attach to that model.  One user interface might be tabs (for users that keep 4 or 5 websites open at a time), but another could be a search-as-you-type mode that doesn’t use any screen real-estate when it is not in use.  There has been a widespread conflation of view (the tab interface) and model (the idea that we have multiple web sites open) in every browser.

The obvious answer is that I should get in gear and write a keyboard-based tab switching interface using JetPack (Ubiquity has one, but it is painfully slow to use for a variety of reasons).  I don’t know…maybe I just convinced myself that I should.

Media and the Sense of Shared Experience

There’s something about our experience that defines us.  You meet someone new, and if they’ve had the same frustration at the checkout counter at the supermarket, or had the same sense of wonder standing and looking at the wake of a ship stretch to the horizon, a momentary spark forms, and a connection is made.  You may know nothing of their values, or their habits, or their dreams, but, in some small but noticeable way, you suddenly share something.

But these connections can now be made in many new ways.  Over the phone, over email, even over identi.ca or Twitter or a myriad of other ways to connect over the internet.  All of these represent a real shared experience of some sort.  But “experience” is a broad concept.  For example, the media (the news, the music industry, the movie industry, and the authors and journalists)  sell an emotional experience, not just information.  Movies (and often music) offer high-powered escapism, a way to journey to a different place and experience emotions that are not part of your normal life.  Books are perhaps even more this way, though they vary so much in content that it is hard to generalize (novels certainly are very much like movies, but what about The Seven Habits of Highly Effective People?  Hardly escapism…)  So, in this realm of mass-published books and broadcast movies and music, suddenly we can get the feeling a shared experience which may not really be shared at all.  You might see a movie in a cramped living room on a standard-definition television with 12 friends, and I might have seen it in a high-definition IMAX theater.  Not quite shared in the same way we often think of it.

Because of this, even though we may have nothing in common with someone, you might hear them say they saw the same movie as you, or read the same book, or saw the same news article, and suddenly this connection is formed.  Your opinions might differ, which can make things awkward, but many times you can find yourself talking with a perfect stranger about their opinions on All The Days Before Tomorrow, thinking that you share something significant because you both saw the film.  The funny part is, maybe you do share something significant because of that.

Of course, after this happens some magical number of times, you are no longer perfect strangers, and are instead good friends.  Perhaps just because you see the same movies, read the same books, and share your thoughts with one another.  Discussion of the faux-shared-experience becomes the genuine shared experience, and the relationship suddenly becomes real.  Life is funny that way.

Software as an Investment in an Ecosystem

Here’s the thing about using commercial (proprietary) software: you are not buying software, or even software-as-a-service.  You’re buying into an ecosystem.  Let me quickly provide examples, rather than pontificating endlessly.

Take the iPhone.  I really like the iPhone.  A good friend of mine just bought a shiny new iPhone 3G S.  She was moving from Verizon, and had used their oh-so-handy Backup Assistant to backup her addresses, phone numbers, emails, etc.  The problem was that Verizon doesn’t really have a way to get the data out of their system.  So, it’s great so long as you stick with Verizon, but as soon as she wanted to switch to AT&T and grab her contacts, she had a problem. (For the curious, I actually extracted the data from her phone, a Motorola, with BitPim, exported from BitPim to vCards, and imported from vCards to Apple’s Address Book, at which point I recommended taking the same vCards file and importing it into Google Contacts as a nice central storage point that does allow you to export.)

The point here is that she invested the time to input roughly 250 contacts into a proprietary system but didn’t have 1) an exit strategy or 2) a plan to keep the system forever.  This is common – when you invest a ton of time in a piece of technology (either by inputting data to it, or learning it) you need to have a plan for what happens to your data (or your skill) if that tool goes away for some reason.  Google Contacts was a good replacement: it is accessible via the web and exports to three different (plain-text) formats that are widely used and can be backed up locally.

Second example: investment of time in learning a proprietary piece of software.  Another good friend of mine is very interested in photography and image editing.  So, he, being motivated, took a class to learn Photoshop and even got the educational discount on Photoshop when he took the class.  The $500 or so isn’t that big of deal, but coupled with the hundreds of hours learning the tool, starts to become a major investment of the only two things we have in life: time and money.

Here’s the problem: the one thing you can be sure of is that you will get a new computer.  People like me upgrade more often than most (every 18-24 months in my case, and I’m not even a gamer), and probably own more regularly-used computers than most (desktop, laptop and netbook).  So the problem becomes apparent: If I invest a few hundred dollars and hundreds or thousands of hours in learning Photoshop (or Excel, or whatever), I have to worry about having that tool wherever I am (on my desktop at home, on my laptop at work, on my netbook at the airport) and whenever I am (today, a year from today, 10 years from today).  And that is when the cost mounts: you are willing to spend lots of time to learn how to use the tool, but that investment represents a hook.  Now, you have an enormous incentive to continue spending money to buy new versions of the software for all your machines that span the space and time you occupy.

So, when my good friend bought a new computer, he had the problem of getting a new copy of Photoshop for it, because he was now heavily invested in the ecosystem surrounding the tool.  He knew how to use the tool, but he also relied on the vast pool of resources on the web related to Photoshop: tutorials, plugins, templates and interoperability.  So, the investment isn’t just his time, his money, and his skill, but also an expectation that certain associated resources surrounding the product will be there for you, as a user.  This is the product’s ecosystem.

I’m not saying you shouldn’t invest in proprietary software systems; I’m saying you should do so with your eyes open to the true costs of the investment – don’t just look at the ecosystem that the iPhone has and think “Everyone has one…and it has 100,000 apps, it’s the way to go!”  Recognize that your are now necessarily going to be compelled to use all the associated iPhone tools, and you will invest yourself in learning how to use them.  Many commercial software vendors recognize this effect: there is a good reason that Microsoft gives away most of their products freely to students.  These companies realize that your willingness to spend the time to learn new tools will trump your desire to hang on to your hard-earned dollars. That is the strategy to keep the money flowing.

So, what can you do when you see this happening to you? You can change your expectations. You can treat the commercial software as a single point in a space of solutions to your needs, and keep an open mind about switching. Learn the tool from the perspective of approaches it offers to solving problems, and try out other (perhaps Free) tools with which the same approach can be applied. In the case of Verizon, you could look at Google Contacts, Plaxo, Thunderbird, or Apple’s Address book. In the case of Photoshop, you can, depending on your needs, look at The Gimp, Picasa, Paint.NET, iPhoto, Paint Shop Pro, or Aperture. But, when you pick a solution, recognize that your computer and your needs will change, and be prepared for it.

Use of Mutable Data Structures in Clojure

One of Clojure’s biggest strengths is that it is backed by the JVM, and has good interoperability with the Java libraries.  When I needed to implement a simulation with events that should be executed in order of their timestamp, I was immediately tempted to use Java’s PriorityBlockingQueue, since neither sorted-map nor sorted-set supported two events with the same time stamp (though I could have used some magic to make it work, but decided simplicity was more important).

I was generating events with several types, and under certain circumstances, events that mapped :type to :arrival needed to be removed because they were no longer valid given the state of the simulation.  Easy enough:

(filter #(not= (:type %) :arrival) event-queue)

But, PriorityBlockingQueue is mutable (oh, and I’d put it in a ref, to make things more interesting), so I’d need to clear it after I did the filter, before repopulating it with the filtered events.  This is a simpler version of the final code:

(let [elements (filter #(not= (:type %) :arrival) @event-queue)]
(.clear @event-queue)
(.addAll @event-queue elements))

So, when I ran this, I kept getting NullPointerException.  Clojure is not well known for its uber-informative stack traces, so I flailed with this for a few hours, looking at it on and off while working on other parts of the code.  Then it hit me: filter is lazy, so the filtering of the elements is not happening until I’m calling in the filter when I make the addAll call.  By that time, @event-queue is empty, so the list of results is nil, which will throw a NullPointerException when passed to the addAll call.  The easy way to fix this is to wrap the call in a doall function, which forces the computation:

(doall (filter #(not= (:type %) :arrival) event-queue))

The moral?  Even though Clojure has good support of Java collections, be wary when you use mutable data structures with the standard Clojure seq functions; many of them are lazy, and unless you know they won’t change, you can get subtle temporal errors creeping into your code.  I got lucky that addAll threw the exception.  If it hadn’t my simulation would have ended and I would have spent a lot more time trying to figure out why.

Beauty in Programming: Part I

Introduction

When I program, I sometimes am fortunate enough to see beauty emerge from what I do. I love programming because of this, but, enthusiastic as I am about it, I find that it is very hard to convey the essence of what I see to others, which often means that they cannot understand what I am so enthusiastic about!

So, I would like to discuss one of my favorite idioms in programming, but before I do so, I would like to discuss something you might be familiar with: organizing information. In the context of information organization, I think my example makes sense, and the beauty arises, and I don’t have to discuss programming at all. This is ideal, because the beauty in programming has nothing to do with programming at all: it has to do with patterns of thought. But I’ve already delayed too long; off we go!

Organizing Information: Hierarchies

In our struggle to store and retrieve information, we have developed systems to help us organize it all. Probably the most prevalent idea in these systems is the notion of a hierarchy. We see this everywhere, from filing cabinets, to file systems on a computer, to organization of library books, to notes taken in an outline form, to the system for classifying life on Earth, and even in this very essay. The idea is simple: establish a set of top-level categories. Each of these can contain either more categories, or items. Thus, any given item has a location in the hierarchy that is defined by all of its parent categories.

This idea of classifying information in a hierarchy is so pervasive that it comes completely naturally to our way of thinking. It has three main strengths:

  1. It allows us to locate information using a broad idea, and narrowing that idea until we finally locate the information we seek
  2. It allows us to easily add new information in the structure, because we have a pretty good idea of where things should go
  3. It allows us to reorganize the ideas, because we can take sub-categories and change their parents as necessary

But it has one major weakness: if a piece of information has multiple facets, it can really only exist in one place in the hierarchy, so users of the system are forced to choose. There is a rich history of developing add-on aspects to hierarchical systems to fix this weakness. In computer file systems used by Linux, Solaris, BSD or Apple’s OS X, for example, a user can establish links in the file hierarchy to allow a file to exist in multiple locations at once. This would be a bit like putting a yellow sticky note in a filing cabinet in the “Hobbies” section that says “Go look at the Automobiles section for info on my hobby sports car.” This works fairly well, but an unsuspecting user might remove the file on the sports car after it got totaled, only to find that little sticky later. Meanwhile, if your brother looked under “Hobbies” to find info on the sports car, he’d be confused when he couldn’t find it where the note indicated. So, it makes maintenance of the system a bit harder, and can confuse users.

Organizing Information: Tags

Nevertheless, hierarchical systems have such great advantages that they are still the most prevalent approach to solving the problem of information organization. Recently, however, another notion has come along: tagging. Tagging is a simple system: each item of data can be assigned a tag, or keyword, that describes some broader category to which it belongs. Each item can have an unlimited number of tags. Going back to our example, the information on the sports car could be tagged “Automobile” and “Hobby”.

This mechanism for organizing information is used in several well-known environments: Google uses it in their bookmarks implementation, Facebook uses it to organize photos and notes, Flickr uses it for photo organization, and many news sites (like Digg and Slashdot) tag news stories. The advantages are obvious: tags allow a fragment of information to exist in multiple places in the hierarchy. But wait, where is the hierarchy now?

The first question many newcomers to tagging ask is how to really organize things. As soon as you try to establish a tag hierarchy, you start to notice the problem. Suppose you had a tag “Cars” and then another for “Sports Cars”. If you wanted to add some information about the discontinuation of the Dodge Viper into your system, what tags would you add? Both “Cars” and “Sports Cars”? Or maybe just “Sports Cars”, since that implies “Cars”? But then, when someone looked for all information on “Cars”, the info on the Viper wouldn’t show up, which doesn’t seem right. Then again, is this system going to require that we add every possible tag when we add information? That seems cumbersome.

So, in exchange for solving a problem, tagging introduces another. We can now freely associate bits of information with tags, but the tags have no hierarchy. When we look back at the strengths of a hierarchical system, we see that we can still locate information using a broad idea. If we look at all information tagged “Cars”, we can then see that many of those items are also tagged “Sports Cars”, so we develop a sort of emergent hierarchy. This concept is often referred to as a “tag cloud“. So, at least tagging systems allow to locate information in much the same way as we could using a hierarchy, and they have the added bonus of allowing us to file data in multiple places.

We’ve said that hierarchical systems are good at adding data. Unfortunately, adding information to a tagged system can be difficult. For each item we want to add to the system, we have to identify all relevant tags when we insert the information. This means not only knowing all the relevant tags in the system, but also anticipating whether any new tags are needed. This could be even more work that creating the data we wanted to insert in the first place!

Finally, we’ve said that hierarchies are good because they allow us to reorganize information. This can also be difficult when we use tags. In addition to collecting cars, we also start collecting motorcycles. So, we’d like to introduce the notion of “Vehicles” to our system, and have it apply to all of our cars, along with the new motorcycles we’ll be adding. In a hierarchy, you’d simply create the “Vehicles” category and put “Cars” into it. With tags, we have to go back and add the tag “Vehicles” to all items tagged with “Cars”. Laborious, to say the least!

Are tags really that bad, or have we missed something? Do we need to build some tools to automate all the cumbersome aspects of using tags? Or is there some underlying truth here that we have failed to recognize?

“Simplicity does not precede complexity, but follows it.” -Alan Perlis

In many tag systems used on the internet, the problems I’ve outlined above remain. The main reason is that it is difficult to formulate elegant systems to address these problems. The primary job of a computer programmer (whether they know it or not!), is to take problems like the one I described above and try to find the underlying model that should be used. You might get an itch in the back of your mind that something is wrong with what I described: tags seems like they should work, but for some reason are difficult to use.

Your first instinct may be to write a program that will automate the more cumbersome aspects of using tags. For example, in the case of adding a “Vehicles” tag, you could just write a program that would grab all the items tagged “Cars” and add “Vehicles” to them. Seems simple enough, and saves work. But it isn’t beautiful. Over time, you’d write a lot of tools to help you perform tasks like that, and the system would become complex. We wanted a simple system when we started using tags, not something that would be hard to understand and use. The key question that is always on my mind is: How can we retain simplicity?

In the systems we’ve been discussing, we have defined two elements. In the hierarchical system, we defined categories and items. In the tagged system, we defined items and tags. What if we tried to define the system using only one element? What if, in the tagged system, we defined tags as items, and items as tags? How would this change its utility?

Such a change would mean that items could be tagged (as before), but also that tags could be tagged, since they are also items. Like many subtle changes, it is hard to see how this small modification would be useful. In fact, it allows us to enjoy the best aspects of both a hierarchical and a tagged approach to organization, and it means the entire system is only composed of one element: a data item. What we’ve done is change how these items relate to one another, and that enables new patterns of thought and organization.

Going back to the example of the sports car that is both a “Car” and a “Hobby”, we can see that this system would support both through the normal use of tagging. But what about our example of “Cars” and “Sports Cars”? Well, in our new system, as soon as we create the data on the Dodge Viper being discontinued, we can simply tag it with “Sports Cars”. Why don’t we have to worry about tagging it “Cars” as well? Well, as soon as we create the tag “Sports Cars”, we tag it with “Cars”. So, now we have embedded the knowledge that “Sports Cars” and “Cars” are related in the tags themselves, rather than in each item using those tags.

Using this idiom, it is easy to see that hierarchies naturally emerge from the structure, but can be more complex than simple trees, due to the nature of tagging. Now, tags can relate themselves to multiple other tags at different points in the tag hierarchy, as can items, because items are tags. Likewise, if you want to provide some notes on a tag, you simply add data to it, because the tag is a data item. Reorganization of the hierarchy is easy – if you want to introduce the notion of “Vehicles”, simply tag “Cars” and “Motorcycles” with “Vehicles”, and you’re done. All items with those tags (or tags tagged with those tags, etc.) will now be tied to the notion of “Vehicles”.

This idea, though simple, may be confusing at first.  Here’s another way to think about it: instead of tagging items with tags, we tag items with other items.  So, if we’re in a filing system, we tag files with other files.  If we organize photos, we tag photos with other photos.  In reality, we have reduced our system of organization to its very simplest form by allowing ourselves only one action: we can establish connections among various members of a homogeneous collection of data.  Now, each member of that collection will ultimately have both data, and connections to other pieces of data.

If this sounds familiar, that’s because it is: it is the exact methodology for organizing the internet.  Web pages sometimes contain data, and they sometimes contain references to other webpages.  Most webpages have both, even in walled gardens like Wikipedia.

Conclusion

By reducing the number of elements in our system, we increased its power, but retained its simplicity.  Our new system has all the benefits of the hierarchical system, as well as supporting all the benefits of tagging.

While I decided to write this and relate the ideas to programming, I got the idea of “data as tags” and “tags as data” from TiddlyWiki, which I used to compose this, as well. If you want to see the system I described in action, visit the TiddlyWiki homepage and download a copy and play with the tagging system: it is an elegant and powerful approach to managing lots of information.