Mar 01, 2011

I wasn't even aware they still used tape backups.

This Gmail outage is proving to be particularly bad, at least for the 0.02% of Gmail users affected. However, it appears the end is in sight for Google. In a post on the Gmail Blog, Google (GOOG) reveals that it has had to go to tape backups and expects to finish restoring accounts in the coming hours.

To protect your information from these unusual bugs, we also back it up to tape. Since the tapes are offline, they’re protected from such software bugs. But restoring data from them also takes longer than transferring your requests to another data center, which is why it’s taken us hours to get the email back instead of milliseconds.

Tape?<!-- more -->

I had no idea that Google was still backing up its data to tape. I had assumed that Google's backups went to disks in different parts of the world. Tape backups, even automated ones, are much more labor-, space-, energy-, and obviously time-intensive than disk backups. Still, tape has the clear advantage of not being susceptible to bad software upgrades.

In fact, having dealt with (relatively) small libraries of data stored on tape that filled up rooms, it is hard for me to imagine that Google is actually using tape. Looking at the ol' LTO Wikipedia Page, the current high density tape holds 1.5 TB uncompressed. Assuming that each of the 200 million Gmail users uses about 1.5 GB (most people I know use much more), that is 300 PB of mail, which means Google needs about 200,000 tapes to store all of that data.

Once. Hopefully Google is also making incremental backups, which take extra space and would offset whatever is saved by compression.

How many is 200,000 tapes? Again assuming LTO, at a bit over 2 cm of depth per cartridge, 200,000 tapes make a stack four kilometers high to back up Gmail. Ouch.
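
For anyone who wants to check my napkin math, here is a rough sketch of the arithmetic in Python. It assumes decimal units (1 TB = 1,000 GB), the 1.5 TB uncompressed LTO capacity mentioned above, and a flat 2 cm per cartridge; the function names and the per-user figures are my own placeholders, not anything Google has published.

```python
import math

USERS = 200_000_000        # Gmail users, the figure used in this post
TAPE_CAPACITY_GB = 1_500   # 1.5 TB uncompressed per tape, decimal units assumed
TAPE_THICKNESS_CM = 2.0    # rough cartridge depth (assumption: a flat 2 cm)


def tapes_needed(gb_per_user, users=USERS, tape_gb=TAPE_CAPACITY_GB):
    """Tapes required for one full, uncompressed copy of everyone's mail."""
    total_gb = gb_per_user * users
    return math.ceil(total_gb / tape_gb)


def stack_height_km(tapes, thickness_cm=TAPE_THICKNESS_CM):
    """Height of a single stack of cartridges, in kilometers."""
    return tapes * thickness_cm / 100_000  # 100,000 cm per km


if __name__ == "__main__":
    # Two guesses at average mailbox size: exactly 1 GB and 1.5 GB.
    for gb in (1.0, 1.5):
        n = tapes_needed(gb)
        print(f"{gb} GB/user: {n:,} tapes, stack of {stack_height_km(n):.2f} km")
```

At 1.5 GB per user this lands on the 200,000 tapes and four kilometers above; at exactly 1 GB per user it is closer to 133,000 tapes, which is still a very tall stack.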

No wonder it is taking 30 hours – but thankfully it can be done at all. Stay tuned for a postmortem from Google shortly.

