May 13, 2004 edition
Email Storage and Archiving - Legal Issues and Best Practices

I read an article recently indicating that many corporations purge their email records regularly to avoid legal obligations or consequences in the future. So, at the last local Windows User Group meeting, I conducted an impromptu survey to get a quick cross-section of the practice. I learned that a number of corporations in the area enforce some policy or another to automatically and systematically purge emails at a defined interval. This predefined deletion schedule ranged anywhere from 90 to 365 days from the date that the email was sent. On the other hand, an unbelievably low number of individuals believed that their employers kept the records indefinitely.

While I can understand the rationale that email, like any other method of communication, has a shelf life, I find it hard to grasp why any corporation would even allow such documents to be irreversibly destroyed. Again, I asked the question, and the answers I received were disturbingly similar to the message of the original article. As I understood it, the common rationale was that if no emails were backed up anywhere, they could not be produced later to any legal entity, thereby minimizing any legal backlash for the corporation. One particular gentleman put it best: "If there is nothing to find, there is nothing to give." None of the Information Technology professionals present at the meeting mentioned anything related to the cost of storage or backups. Not one!

Now, I am no legal expert. But, even as a Systems Administrator, I get the feeling that this is a short-sighted approach to records management and could easily cause severe long-term legal consequences. I always thought that it would be better simply not to violate any regulations rather than to try to cover one's tracks after the fact. But, as I said, I am no legal expert.
I did, however, read a summary of the Sarbanes-Oxley Act of 2002. The Act itself is organized into eleven titles, although Sections 302, 401, 404, 409, 802 and 906 are the most significant with respect to compliance, and Section 404 seems to cause the most concern to many corporations. However, the part that really caught my attention was Section 802, which lays the groundwork for prison sentences of up to twenty years for deliberate actions such as email deletions. Under the guidelines put forth by the Act, individuals who purge data such as emails "in contemplation" of an investigation or "matter" that does not yet exist may be at risk if a jury were convinced that the intent behind the action(s) was to "impede, obstruct or influence" such future matters.

Clearly, the system administrators that I had "surveyed" had no idea of the potential consequences of their employers' policies. Furthermore, I am positive that their employers wouldn't be too happy if they learned that their policies were being openly discussed by their trusted employees, but that's another matter.

In light of the recent Enron scandal, the Sarbanes-Oxley Act itself, and good old common sense, I would argue that public companies would be well advised to mandate some sort of long-term email retention policy, even if it were specific to business units and/or key employees. Surely there must be products out there that enable system administrators to define destruction and retention parameters for email data. I thought that such a product would be welcomed by the senior executive staff at just about any publicly traded company. So, I set out to find just that magic bullet. What I found was that although there are countless mechanisms to back up and retain email, there really are two basic approaches to pretty much any data backup: either keep the data on the end workstation or back it up directly from the messaging server.
Either approach raises concerns about storage usage and legal accountability, not to mention endless technical arguments and counter-arguments. For example, storage on a desktop system costs close to (if not less than) a dollar per gigabyte, whereas a backup of all these systems "dumped" onto a central back-end high-performance SAN or NAS would easily cost more than five times as much in disk space alone. Multiply that by the number of employees in a corporation, add the overhead of managing such a data store, and the equation becomes considerably uglier. For server-side storage, there are other concerns about slowing down the electronic jugular of any institution. Then there's that classic Shakespearean question: to use a client or not to use a client?

I continued my search and found countless companies offering one product after another. Invariably, the implementations boiled down to the two core philosophies discussed earlier. One company that caught my attention was Connected Corporation, headquartered in Framingham, MA. Their implementation was intriguing because not only do they leverage full and incremental backups, they also employ a technique that they have named SendOnce. This technology prevents multiple copies of the same data (including email attachments, which make up the bulk of email size) from being transmitted to the Data Center. SendOnce saves common files in the user's archive set as well as in a SendOnce pool. Subsequent copies of these files are assigned a pointer to the SendOnce pool, preventing multiple copies of common files from being stored in multiple user archives.

Impressed by the theory, I decided to try it out. I set up a Windows XP workstation with some common line-of-business applications. By the time I was done, about 6 gigabytes of space was used up between the applications and the operating system. There was no data on the system just yet.
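For readers who want to see the pointer idea in concrete terms, here is a minimal sketch of single-instance storage. It is not Connected's implementation; nothing I read describes how SendOnce recognizes identical files, so the SHA-256 content hash used here is purely my assumption, and the class and method names (`SingleInstanceStore`, `backup`, `restore`) are hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Toy backup store that keeps one copy of each distinct file body.

    Assumption: identical files are detected by a SHA-256 hash of their
    contents. The real SendOnce mechanism is not documented in this article.
    """

    def __init__(self):
        self.pool = {}       # content hash -> file bytes (the shared pool)
        self.archives = {}   # user -> {path: content hash} (pointers only)
        self.bytes_sent = 0  # bytes that actually had to be transmitted

    def backup(self, user, path, data):
        """Record `path` for `user`; return True if the bytes were new."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.pool
        if is_new:
            # First sighting of this content: store it and count the transfer.
            self.pool[digest] = data
            self.bytes_sent += len(data)
        # Every archive entry is just a pointer into the shared pool.
        self.archives.setdefault(user, {})[path] = digest
        return is_new

    def restore(self, user, path):
        """Follow the user's pointer back to the pooled bytes."""
        return self.pool[self.archives[user][path]]


store = SingleInstanceStore()
common_file = b"x" * 1000  # stands in for an OS or application file

store.backup("pc1", "C:/WINDOWS/system32/common.dll", common_file)
store.backup("pc2", "C:/WINDOWS/system32/common.dll", common_file)
store.backup("pc2", "C:/docs/memo.txt", b"unique memo")

print(store.bytes_sent)  # 1011: the common file once, plus the 11-byte memo
print(len(store.pool))   # 2 distinct file bodies in the pool
```

In this toy run the second machine transmits only the data that is unique to it, while both machines can still restore the common file through their own pointers.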
I imaged the system, set up an identical system, and backed it up. The results surprised me. The second system backed up less than 30 megabytes of data. Upon further research, I learned that after data compression and applying SendOnce technology, they average a disk space usage of 1.7 MB for every 700 MB of data! Now, that's something a cost-conscious CIO could live with. The server-side email archival product offering, "ArchiveStore/EM", is just as unique and works without a client on the end-user's system.

I believe that this approach to preventing data duplication within the back-end archives is so elegant in concept and so cost-effective in implementation that it will certainly be offered in different flavors by other software vendors. It certainly does prove that the industry recognizes the importance of data retention and is quickly working to provide us with solutions for it.

Athar A. Khan
Executive Information Technologist
atharkhan@email.com