Confusion and mystery are good things for vendors and bad things for consumers. Secret sauces are expensive. The storage industry has many examples of amazing profits built on the back of opportunistic proprietary inventions.
The distinction between storage and data can be more difficult to ascertain than one might think. The concept of CAS (content addressable storage) is an example of how storage and data have been twisted together into a confused technology collage, creating an opportunity for vendors to print money.
At the end of the day, IT departments need to find ways to locate data. In the case of CAS, the data is placed in a special repository for historical archives and various algorithms are run against it, producing metadata that can be used to locate the data at some later time. The belief is that preparing metadata today saves some amount of time and cost later.
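As a rough illustration of the general idea behind content addressing, here is a minimal sketch in which the "address" of a piece of data is simply a hash of its contents. The `ContentStore` class, its in-memory dictionary, and the choice of SHA-256 are assumptions made for the example; they do not describe how any particular CAS product works.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (illustrative only)."""

    def __init__(self):
        # Maps a content-derived address to the stored bytes.
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is computed from the content itself, so storing
        # identical bytes twice yields the same address.
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        # Retrieval needs only the address recorded at archive time.
        return self._blobs[address]


store = ContentStore()
address = store.put(b"2009 audit report")
print(address, store.get(address))
```

Nothing here requires special hardware: the hashing is ordinary software that could run against any storage.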
Like many insurance or warranty products, CAS is paid for in advance. CAS customers pay top dollar for storage capacity that is likely going to be used very little. While there are probably some cases where customers use CAS products regularly, the question is: Couldn’t the same thing have been done at much lower cost? The answer to that question is obviously yes. One could choose to use indexing, search or archive software tools, any of which could be more effective than what is integrated by the CAS vendor.
So, instead of placing archived data on expensive, little-used CAS storage, it can be placed on low-cost, little-used storage where it is acted upon by any number of archiving, indexing or search technologies. The big advantage of separating software from hardware this way is that all the components of the archiving solution can be replaced if desired. If better software comes along, it can be easily incorporated. With a CAS system, customers are at the mercy of the hardware vendor to provide an update or upgrade.
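To make the separation concrete, here is a minimal sketch of a software-only index built over plain files on ordinary storage. The function names `build_index` and `search`, the `/mnt/archive` path, and the restriction to `.txt` files are hypothetical choices for the example; a real deployment would more likely use an existing indexing or search product.

```python
import re
from collections import defaultdict
from pathlib import Path


def build_index(root: str) -> dict[str, set[str]]:
    """Map each lowercase word to the set of files that contain it."""
    index = defaultdict(set)
    for path in Path(root).rglob("*.txt"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            index[word].add(str(path))
    return index


def search(index, *terms):
    """Return the files that contain every one of the given terms."""
    matches = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*matches) if matches else set()


if __name__ == "__main__":
    # Hypothetical archive location on commodity storage.
    idx = build_index("/mnt/archive")
    print(search(idx, "invoice", "2009"))
```

Because the index is just software reading ordinary files, it can be rebuilt or swapped for a better tool without touching the stored data.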
The thing to keep in mind about archiving is that the function is expected to be needed far into the future, far past the lifespan of most technologies. The notion that any technology, storage hardware in particular, would have the ability to successfully manage the lifecycle of data archives is simply bizarre. The risk of a future data blackout increases if successful data access depends on the continued longevity of any particular vendor and any particular product sold by that vendor.
Archiving is going to be difficult work for many of us for many years, but it will be less difficult for those who don't paint themselves into corners with vendors looking for customers to milk with a string of expensive upgrades and services.