Have you heard of this industry adage before?
*Backup is for recovery, archive is for retention*
If you have, and you’ve already done something about it, stop reading. If your data retention strategy is plain old backup tapes held in a dark cupboard or the back of your car, it’s time to face the facts: tape is not the long-term retention medium it used to be.
Today we all face the challenge of data growth. We throw around terms like “big data” and “unstructured data” with ease, and yet we don’t quite grasp how to control its impact on our storage and backup infrastructure and our IT budgets.
If we take a step back into the heyday of mainframe computing, hierarchical storage management (HSM) was introduced to automatically move infrequently accessed data from high-cost disk and volatile memory-based storage to tape. Spinning platters coated in ferromagnetic film were exceedingly expensive and offered minuscule capacity compared with the density of disk storage today, so much so that businesses and institutions were forced to move data to tape as an exercise in cost reduction.
Since then, HSM has evolved into an application-aware solution that can capture and archive data from many sources; file and email data are the most common examples.
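To make the idea concrete, here is a minimal sketch of the kind of age-based selection an HSM policy performs: walk a directory tree and list the files that have not changed for a given period. It is an illustration only, not how any particular archiving product works. The function name `find_archive_candidates` and the one-year default are assumptions, and the sketch uses modification time rather than access time because many file systems are mounted so that access times are not reliably updated.

```python
import os
import time
from pathlib import Path


def find_archive_candidates(root, max_idle_days=365, now=None):
    """Return files under `root` not modified in the last `max_idle_days` days.

    Hypothetical helper for illustration; a real HSM product would apply
    richer policies (file type, owner, size, application awareness).
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # 86400 seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                stat = path.stat()
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if stat.st_mtime < cutoff:
                candidates.append(path)
    return sorted(candidates)
```

Note that the walk itself touches every file, which is the same discovery cost a backup job pays on each run; an archive policy pays it so that later backups have less to inspect.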
But why is hierarchical storage management still relevant even though the use of mainframe computing is now a niche in a market bulging with cheap commodity compute and storage?
– **You don’t need to keep protecting unchanged data every time you back up.**
While technologies such as compression and deduplication ease the storage costs of keeping data in production storage tiers, your backup application still needs to discover and inspect this data every time you run a backup job. Once this scales to millions of files, the inspection time eats into your available backup window.
– **Doing nothing will increase the cost of keeping your valuable data instantly available and protected.**
Eventually you could find yourself in a position where backup times are lengthy enough to warrant additional backup servers and additional backup media. Meeting your recovery time and recovery point objectives will then require you to sink more of your budget into protecting your data. Every byte of data stored on production storage adds to the growth of your data protection infrastructure.
– **Keeping your data on tape for *’x’* years could hurt you financially in the event of legal action.**
This one is a no-brainer when you consider the cost of a court-ordered discovery of data stored on tape. Discovery requests extend not only to data in your active environment; they also require you to go fishing for backup tapes and other data storage devices. Consider one backup tape that contains 1 terabyte of file data, only a portion of which is relevant. Extend the review and tagging of that data to a year or more of monthly or weekly tape sets and your legal costs escalate quickly.
– **Your archive data is searchable and retrievable.**
An archival solution gives your end users self-service options to find and access data moved to lower-cost storage tiers. Regardless of the location of your data, onsite or in the cloud, users can be presented with search options that range from basic to powerful, depending on the investment commitment of the business. Combine this with standard ‘stubbing’ functionality, where messages, attachments and files are replaced with pointers, and you can deliver a user experience that is seamless, with a minimal learning curve.
– **Commodity storage adoption increases data sprawl.**
As the cost of storage continues to decrease year on year, we are capable of storing more data. With storage to spare, we are more inclined to find additional uses for our commodity storage. In turn, as that storage is consumed, it’s simple enough to bolt in additional trays of storage to service this continued demand. This contribution to data growth is hard to control unless mechanisms are put in place that make reuse of existing storage more attractive from a cost perspective. Nowhere does this hold more true than in the combination of archival with cloud storage.
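The ‘stubbing’ mentioned above can be sketched in a few lines: the file moves to a lower-cost location and a small pointer stays behind so it can be recalled on demand. This is a simplified illustration under stated assumptions, not a description of any vendor’s implementation. The `.stub` suffix, the JSON pointer format and the function names `archive_with_stub` and `recall` are all hypothetical, and the sketch ignores name collisions in the archive location.

```python
import json
import shutil
from pathlib import Path

STUB_SUFFIX = ".stub"  # hypothetical naming convention for pointer files


def archive_with_stub(source, archive_dir):
    """Move `source` into `archive_dir` and leave a small pointer file behind."""
    source = Path(source)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / source.name
    size = source.stat().st_size
    shutil.move(str(source), str(target))
    # The stub records where the data went so it can be recalled later.
    stub = source.with_name(source.name + STUB_SUFFIX)
    stub.write_text(json.dumps({"archived_to": str(target), "size_bytes": size}))
    return stub


def recall(stub):
    """Restore the archived file to its original location and remove the stub."""
    stub = Path(stub)
    info = json.loads(stub.read_text())
    original = stub.with_name(stub.name[: -len(STUB_SUFFIX)])
    shutil.move(info["archived_to"], str(original))
    stub.unlink()
    return original
```

In a real product the recall step is typically triggered transparently when a user opens the stub, which is what keeps the learning curve minimal.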
Tying it all together, an archiving solution can provide significant benefit to the commercial, government and enterprise sectors. In the brave new world where big data and unstructured data continue to rule the storage roost, neglecting smart management of this data will eventually hurt your business financially, if it isn’t already. When evaluated against the costs of inaction and e-discovery, an archiving solution is justified in any business case.
Now is the time for you to do away with your keep-everything-forever approach.
*Rowan Gillson is a Solution Architect in Data Centre Solutions at Thomas Duryea Consulting.*