E-mail Archive Economics
How expensive is it to store many terabytes of enterprise email so that it remains available for survey, eDiscovery, investigations, internal control and institutional memory? The raw cost of data storage is dropping, so shouldn't it be cheap to retain lots of email?
I explored this topic in an email exchange with Dennis Vlasich, Director of the IT Department at the municipality of Fontana, California. Following is our conversation.
I started by pointing Dennis to my earlier post suggesting "content addressable storage" for inexpensive retention of large volumes of e-mail.
DENNIS replied: “It [the earlier post] is basically a standard storage vendor's sales pitch and we have a SAN that can accommodate terabytes of ‘cheap’ storage. You have to factor in the rack space, power and HVAC requirements to keep the storage system running and I can assure you that because such technology [content addressable storage] is less expensive, its MTBF [mean time between failures] will be much shorter than other forms of storage, so what happens when it crashes? The bottom line is cost versus benefit versus risk.” [Update: Greg Smith of Messaging Architects responds to Dennis' comment.]
To which BEN said: “As a lawyer, here's what I don't understand. I use Yahoo as my business email provider. I pay Yahoo a paltry $20 a year for this email service, and Yahoo provides me unlimited storage. Unlimited. And since I pay for my service, Yahoo does not attach ads to my email. I store on Yahoo all my business e-mail (including attachments) dating back to 2004. (I also back it up in zip files I download onto a local hard drive.) The economics suggest that storage costs -- including all the infrastructure and management -- are just not a big deal to Yahoo. Why can't an in-house IT department at a mid-sized enterprise get anywhere close to the same economics that Yahoo (or AOL or Hotmail) gets?”
Then DENNIS said: “Yahoo probably restricts the size of your attachments to 1MB and your mail client (probably not Outlook) does not have the search and archiving capabilities of an Exchange/Outlook environment. I don't know what your SLA [service level agreement] is with them, but I would guess they aren't guaranteeing that all your email will be available forever.
“The City of Los Angeles just switched from an in-house mail system (using GroupWise) to Google's Gmail for 30,000 users. Google charges $50 per year per user for up to 250 gigabytes of disk space; however, they do not have any archiving capabilities yet.
“Google, Yahoo, and other large providers are using data center technologies on a massive scale that bring the cost per MB down below what those of us in smaller shops can come even close to. The big problem they have is security. Google had to guarantee LA that it would use dedicated servers and storage for all their email so it wouldn't commingle with that of their other customers, and for $7.5MM Google said OK. Most of the rest of us cannot negotiate on that scale.
“The administration overhead for managed disk space is quite high because it requires routine backups, which take time and resources, all of which cost money.”
BEN responded: “My attachments, both incoming and outgoing, are limited to 20MB, which is reasonably large. The mail client I use is Yahoo's web mail page. It provides quite a few search capabilities. I have my 'archives' arranged as large chronological zip folders on my local hard drive. Still, Yahoo stores a full copy of all my emails back to 2004.”
DENNIS: “[Regarding the 1MB attachment limit] I guess I was thinking of the free version [of Yahoo mail]. It's been a while since I looked at those kinds of mail services. We're looking seriously at Google Apps with Gmail (waiting to see how things work for LA). The 250GB limit might be a problem for some of our users, but I figure by the time we're ready to move, there will be more space available or they will have an archiving feature.”
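For readers who want to put the figures from this exchange side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above ($20 per year for my Yahoo account; $50 per user per year, 30,000 users, a 250GB quota and the $7.5MM figure for Los Angeles). The function and variable names are my own, and the "years equivalent" value is plain division, not a statement about the actual term of the Los Angeles contract, which the conversation does not give.

```python
# Back-of-the-envelope comparison using only figures quoted in the conversation.
# All names here are illustrative; none come from a vendor price list.

def annual_cost(users: int, price_per_user_year: float) -> float:
    """Total yearly subscription cost for a given number of mailboxes."""
    return users * price_per_user_year


def cost_per_gb_year(price_per_user_year: float, quota_gb: float) -> float:
    """Yearly price per GB, assuming a user fills the whole quota."""
    return price_per_user_year / quota_gb


LA_USERS = 30_000             # users quoted for the City of Los Angeles
GOOGLE_PRICE = 50.0           # dollars per user per year, as quoted
GOOGLE_QUOTA_GB = 250.0       # per-user quota, as quoted
LA_QUOTED_TOTAL = 7_500_000.0 # the "$7.5MM" figure, as quoted
YAHOO_PRICE = 20.0            # dollars per year for one paid Yahoo mailbox

la_annual = annual_cost(LA_USERS, GOOGLE_PRICE)

# Assumption: dividing the quoted total by the yearly list price only shows
# how many years of service that total would buy at $50 per user.
years_equivalent = LA_QUOTED_TOTAL / la_annual

google_per_gb = cost_per_gb_year(GOOGLE_PRICE, GOOGLE_QUOTA_GB)

print(f"LA annual cost at list price: ${la_annual:,.0f}")
print(f"Quoted total expressed in years of service: {years_equivalent:.1f}")
print(f"Google price per GB per year at full quota: ${google_per_gb:.2f}")
print(f"Single paid Yahoo mailbox per year: ${YAHOO_PRICE:.2f}")
```

An in-house department could swap in its own hardware, power, HVAC and backup costs per mailbox to see how far its number sits from these hosted prices.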
Update July 2010: The General Services Administration has certified Google Apps as qualified for use by government agencies. Typically, a US government implementation of Google Apps would keep data on servers in the continental US.
--Benjamin Wright. A senior instructor at the SANS Institute, Mr. Wright regularly serves as a public speaker at lunch and dinner meetings for local professional groups, such as IIA and ARMA.