Is Archival Affordable?
As a mere lawyer, I struggle to understand from a technical perspective how to implement an e-mail retention policy in a larger organization across many years. Legally, the organization may – due to litigation hold or other factors – have an incentive to store some (maybe a lot) of its email for a long time, more than five or seven years.
Legally speaking, the biggest mistake with e-mail records is to destroy them too early. That could be spoliation. So why not delay destruction a long time?
In theory, long-term storage of a large volume of e-mail should not be expensive because the cost of raw storage is dropping. However, IT folks in my SANS courses argue that the cost of storage media is not the key issue. They argue that the overhead associated with massive storage is high (they bring up things like expensive disk arrays).
Is cheap, slow storage technically feasible? As I craft records management policy, I’m not looking for something that pulls up records quickly. I’m looking to store records that the enterprise rarely accesses, if ever. I’m just looking for something that is cheap and prevents the enterprise from erasing something it might eventually need.
I consulted Greg Smith, email archival expert at Messaging Architects. Greg introduced me to “content addressable storage.” According to Greg, “CAS stores records, like archives, that are not to be updated. It takes advantage of the ever-declining costs of media and hardware. It avoids the operating system and management overhead associated with conventional disk storage. It stores records in an evolving ‘data cloud,’ where you add and remove generic, off-the-shelf hardware as desired. It avoids the need and expense for backup. Today, an organization can store a gigabyte of data for $10. Tomorrow the cost will be lower.”
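To make the idea of content addressable storage concrete, here is a minimal sketch in Python. It is not how M+SecureStore or any other product is built; the class name, the use of SHA-256, and the in-memory dictionary are all assumptions chosen for illustration. The point it shows is that each record's address is derived from its content, so a stored record cannot be silently changed and identical records are kept only once.

```python
import hashlib


class ContentAddressableStore:
    """Toy write-once store (hypothetical): a record's address is the
    SHA-256 digest of its bytes. A real system would persist to disk
    or to a cluster rather than to an in-memory dict."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store a record and return its content-derived address."""
        address = hashlib.sha256(data).hexdigest()
        # Identical content always yields the same address, so storing
        # the same record twice keeps a single copy.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        """Fetch a record and verify it still matches its address."""
        data = self._blobs[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored record does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._blobs)
```

Because the address doubles as an integrity check, there is no "update" operation: changing a record would produce a different address, which suits archives that are not meant to be edited.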
Messaging Architects sells a version of content addressable storage, which it calls M+SecureStore. SecureStore supports a cluster approach to hardware, where devices are efficiently added or removed as needed. It could allow an aggregation of data records for many years, while eventually allowing records to be destroyed in large chunks. For example, SecureStore might keep e-mail archives aged four to 10 years. Then, once every record from a given year has passed the 10-year mark, that year's records could be destroyed as a group.
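The "destroy in large chunks" idea can be sketched as grouping records by calendar year and dropping a whole year at once. The sketch below is only an illustration under stated assumptions: the class name, the 10-year default, and the `held_years` parameter (a stand-in for a litigation hold) are hypothetical and do not describe SecureStore's actual behavior.

```python
from collections import defaultdict
from datetime import date


class YearlyArchive:
    """Hypothetical archive that groups message IDs by the calendar
    year received, so a whole year can be destroyed in one batch."""

    def __init__(self, retention_years: int = 10):
        self.retention_years = retention_years
        self._by_year = defaultdict(list)

    def add(self, received: date, message_id: str) -> None:
        self._by_year[received.year].append(message_id)

    def expired_years(self, today: date, held_years=frozenset()):
        """Years whose every record is past the retention period.

        The youngest record of year Y is dated Dec 31 of Y, so the
        whole batch is past retention only from Jan 1 of
        Y + retention_years + 1. Years under hold are skipped.
        """
        return sorted(
            year
            for year in self._by_year
            if year + self.retention_years < today.year
            and year not in held_years
        )

    def destroy(self, today: date, held_years=frozenset()) -> int:
        """Delete every expired year; return how many records went."""
        removed = 0
        for year in self.expired_years(today, held_years):
            removed += len(self._by_year.pop(year))
        return removed
```

Waiting until the entire year has aged out errs on the side of keeping records slightly longer, which fits the view that the bigger legal risk is destroying records too early.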
Content addressable storage might be used either in-house or in the external cloud (where the service level agreement calls for relatively slow system response in exchange for lower cost).
–Benjamin Wright
A legal issues instructor at the SANS Institute, Mr. Wright is a strategic advisor to Messaging Architects, experts in ediscovery audit. Here is an ARMA podcast describing Messaging Architects' work in e-mail archiving, including its workshop (led by Mr. Wright) for development of an email records policy in an enterprise.
I vehemently disagree with this sort of approach. It all starts with the term "archival". What is that? A noun? An adjective? How about we use the technical term properly? "Is archiving email affordable?" Last time I checked, the proper usage of that word was as an adjective. (And yes, I am a fallen-away archivist.)
Beyond that, you are classifying a medium of transport and storage with a retention period that places the same value and weight on all documents. In my mind, doing this is the same as setting a retention period for "paper". It is no different, except that it promotes laziness and happens to leverage technology. The outcome is simple for the end user because there are no boxes to pack and all that storage is out of sight and mind. So following your advice, we keep every piece of crap that shows up in email for the same amount of time as real records of the organization -- oh, and those real records that actually have to be retained for seriously long periods of time (say, intellectual property records) will either be discarded long before their real retention period or force garbage to be retained for entirely too long. And honestly, if you're going to make a case for retaining all the crap that shows up in email, you better make the same case for every piece of paper -- particularly those pieces of paper that come in the mail, junk mail or not.
So tell me, what is the correct retention period for email? You mention 5, 7 or 10 years. Which one? How do we choose? What about my patent records? What about the spam and personal email that slips into the repository?
This is great business for outside counsel. Those cheap gigabytes become terabytes, become petabytes -- and all will likely have to be searched at some point at extraordinary rates by counsel.
Posted by: Patrick Cunningham, CRM | October 05, 2009 at 11:09 PM
Patrick: This topic makes me humble! I whole-heartedly welcome your comments. Your comments are thoughtful and logical. I cannot simply dismiss your arguments. The difference between you and me is not a difference of knowledge or logic. The difference is one of judgment . . . instinct . . . guess-work.
Here's my question to you: Show me a case study in which important people like executives have (over an extended period) successfully, properly sifted through their e-mail and put all their e-mail into the right categories for retention or destruction. I am not aware that such a case study exists, and I have searched for it in earnest.
The reason that such a case study does not exist is this: important people like executives are lazy. Their laziness is an empirically-proven fact. See the story about lazy executives at Philip Morris http://legal-beagle.typepad.com/wrights_legal_beagle/2008/10/print-and-retain-e-mail-disaster.html.
(I use the words "lazy" and "laziness" in the same sense that you use the word "laziness". It means, "not taking the time and not devoting the attention necessary to understand and successfully apply a record retention policy that requires different content to be analyzed and stored according to different retention categories.")
I thank you, Patrick, and I invite more quality commentary like yours! --Ben
Posted by: Benjamin Wright | October 06, 2009 at 10:46 AM
Patrick asked these good questions:
>>So tell me, what is the correct retention period for email? You mention 5, 7 or 10 years. Which one?<<
There is no perfect, one-size-fits-all answer. Furthermore, there never has been and never will be a perfect, one-size-fits-all answer for any kind of record, even a record of a defined, specific content, such as a "contract" or an "invoice." Even though some authority (such as a government archivist speaking to state agencies) may recommend or even sorta require that a record like an invoice be destroyed at, say, year 4, there is always the possibility that spoliation or obstruction of justice law will override that recommendation and require a longer, ill-defined retention period.
So back to Patrick's questions: >>You mention 5, 7 or 10 years. Which one? How do we choose?<<
The answer is always an imperfect judgment call. One factor to consider in making that call is the litigation profile of the enterprise in question -- how often does it get sued and for what? A second factor can be the cost of retention -- which is the topic of the original post above. A third factor might be the role of the person who sent or received the email. For example, it may make more sense to keep the CEO's email for 10 years but a blue-collar employee's email for only 2 years.
>>What about my patent records?<<
That's a difficult question. Maybe the answer is a long, long time. I recently blogged about the patent case Phillip M. Adams & Associates v. Dell Inc., where the court sanctioned a company for destroying its records years before litigation was filed or specifically threatened!
>>What about the spam and personal email that slips into the repository?<<
To me, a spam filter is a sensible method to minimize spam in long-term archives. As for personal email, one approach is to tell employees that they should take all of their personal e-mail to their smart phones. Today, most employees who work on computers in the office also own personal smart phones that can handle their personal traffic.
--Ben
Posted by: Benjamin Wright | October 06, 2009 at 03:57 PM
Ben,
Thank you for the response. The issue continues to vex me because this problem will get far worse for most companies before it gets better. The problem with electronic records is that there is really no clutter to drive fire marshals batty and the problem is relegated off to the side as an "IT problem". Unfortunately, when IT starts to deal with the issue, they don't see "cheap storage" as the solution; they see mounting KTLO (keeping the lights on) costs as the problem. They also see one-size-fits-all deletion ("if it hasn't been accessed in two years, delete it") as a solution. Spoliation is not something that they understand.
Couple that with divergent court rulings and the sea change to "cloud" storage and services (which the courts really don't quite "get" yet), and this problem becomes far more difficult. Then you add in the anarchy of the user community and the web 2.0 blurring of work and personal life. It is a mess that few of us may ever really resolve.
While I would love to ban personal email at work, I doubt that we could ever effectively police it. But the real problem is that people are tending to push work email to personal accounts because they find access to their work email so unwieldy. What they really want is what Google offers -- a single repository for everything, easily available anywhere. That is a records manager's and security manager's nightmare and Google is very slow on the uptake of the security, retention and e-discovery needs of the enterprise, but stressed IT departments see the cloud as the solution to their problems.
So do we need a fundamental re-thinking of our approach to records management? In some ways, yes. But I also think that we need to require some fundamental changes to the regulatory and judicial environment that today effectively require overly complex solutions to setting retention periods and avoiding spoliation charges. But the courts also have to get past the approach that somehow e-discovery is different than paper discovery or that e-records are different than paper records. The process is discovery and records are records. Consistency and simplicity are what is needed.
Posted by: Patrick Cunningham, CRM | October 06, 2009 at 10:16 PM
Employees can be informed that if they mix their personal and business communications, they risk having their personal privacy, equipment, messages, and accounts tied up in business-related lawsuits and investigations. http://legal-beagle.typepad.com/wrights_legal_beagle/2009/05/e-mail-records-on-home-computers-and-personal-blackberries.html Employees have lots of incentive to keep personal separate from business. --Ben
Posted by: Benjamin Wright | October 07, 2009 at 06:25 AM
A little late on the comment back...
I carry two cell phones because I understand this issue. People think I'm crazy to carry two phones. But I have been involved in matters at the day job where we had to ask an employee for their personal computer or storage device. My forensics analysts then get exposed to all sorts of things, simply because the employee couldn't separate work from personal. And much of that is very embarrassing, and in some cases, leads to family problems or other issues.
Problem is, the problem will get worse long before it gets better, if ever.
We now have devices which encourage employees to deliver work and personal email to the same device. That same device can be used by the employee to Tweet or Facebook both work and personal information. Larger screens and bigger keyboards will make these devices much more than text pagers or SMS messengers. They are full-featured computers in small form factors and that will get them brought into litigation.
And what I find amazing in all of this is that we're moving towards a more privacy-oriented society, but people don't seem to understand that when they mix personal with work in an electronic device, a whole lot of their personal lives may end up on the big screen in a courtroom.
Posted by: Patrick Cunningham, CRM, FAI | November 02, 2009 at 11:33 PM