Feb 21, 2008

Storage Trend Predictions for 2008

Thursday, 24 January 2008
Hu Yoshida, CTO, Hitachi Data Systems

1. Controlling Carbon Emissions: With increasing concern about global warming, we will see more governments impose guidelines and legislation around carbon emissions. Major corporations will set targets for the reduction of carbon emissions.

A major source of carbon emissions is the generation of electricity. The increasing demand for computer power, network bandwidth, and storage capacity will increase the need for data centre power and cooling. The US government has just completed a study estimating that the IT sector consumed about 61 billion kilowatt-hours (kWh) in 2006 (1.5 percent of total US electricity consumption), for a total electricity cost of about $4.5 billion. This is expected to double in the next 5 years. Some cities, like London and New York, are running out of electrical capacity, and data centres are being forced to relocate to areas where power is available. This will require facility upgrades and investment in Green technology.

In September of 2007, Hitachi Ltd announced a program in Japan known as CoolCentre50, targeted at reducing power consumption in its Yokohama and Okayama data centres by 50% in 5 years. This effort encompasses all of the groups in Hitachi, including air conditioning, power generation, IT equipment, and management software. In support of this goal, Naoya Takahashi, Executive General Manager and Vice President of Hitachi's Information and Telecommunications Group, announced the Harmonious Green Plan, which aims to reduce CO2 emissions by 330,000 tons over the next 5 years through the development of power-saving IT products like storage and server virtualisation.

2. Economic Uncertainty: The collapse of the housing market in the US, high oil prices, and the falling dollar will create economic uncertainty. Budgets will be tight and IT will have to do more with less. Doing more with less will drive IT to consolidate resources through virtualisation, increase utilisation of resources such as server cycles and storage capacity, eliminate redundancies wherever possible through de-duplication and single instance store, and reduce the working set of production data through the aggressive use of archive products.

3. Increasing Use of Archiving: Structured data stores like databases are exploding as they are required to hold more data, for longer, for compliance reasons. Semi-structured data, like email, web pages, and document management data, is increasing dramatically. Corporate email quotas will increase from less than 200 MB to 2 GB in order to support new knowledge workers and compete with free mailbox offerings from Google and AOL. An avalanche of unstructured data will be driven by RFID tags, smart cards, and sensors that monitor everything from heartbeats to border crossings. The new Airbus and Boeing Dreamliner aircraft will generate terabytes (TBs) of data on each flight. All these pressures will drive the need to archive data in order to reduce the working set of production data. This will call for new types of archiving systems that can scale to petabytes and provide the ability to search for content across different modalities of data. Creating a separate archive for each type of data will not solve the problem.

4. Awareness of Storage Deficiencies: There will be a growing awareness that the storage of data has become highly inefficient, with low utilisation, stranded storage, too many redundant copies, low access speeds, inefficient search, and disruptive movement and migration. Continuing to buy more of the same old storage architectures will no longer be an option: buying faster storage processors with larger capacity disks on the same 20-year-old architectures is no longer viable. New storage architectures will be required to meet these changing demands. A new architecture must scale performance, connectivity, and capacity non-disruptively to multiple petabytes. It must also provide new data and storage services, like multi-protocol ingestion and common search, across heterogeneous storage arrays with centralised management and secure protection.

5. Data Mobility will be a Key Requirement: With the need for continuous application availability, IT will need the ability to move data without disruption to the application. While software data movers have been used in the past, they steal processor cycles from the application and are limited to slow-speed IP links for moving data. As the volume of data increases, this becomes too disruptive. The movement of data will have to be offloaded to a storage system that can move data over high-speed Fibre Channel links without consuming the application's processor cycles. This will become increasingly important for the migration of data during storage upgrades to ever larger storage frames.

6. Control Unit Virtualisation of Storage: Control unit virtualisation of storage will be recognised as the only approach to storage virtualisation that can add value to existing storage arrays. Industry analysts like Dr Kevin McIsaac of Intelligent Business Research Services Pty in Australia point out that "the idea of being able to layer (network-based) virtualisation over existing storage arrays is seriously flawed" because it "results in a lowest common denominator view of the infrastructure, eliminating the value-added features of the array." This type of virtualisation adds another layer of complexity, introduces a performance bottleneck, becomes another potential source of failure, and creates vendor lock-in. A control unit based approach to virtualisation, by contrast, can leverage all the rich functionality of the control unit to enhance lower cost or legacy tiers of storage arrays, enabling less capable storage systems to utilise the control unit's value-added services, like data mobility functions or thin provisioning capabilities.
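
To make the layering concrete, here is a minimal sketch in Python (the ExternalArray and ControlUnit classes and their methods are illustrative assumptions, not any vendor's product): hosts address virtual LUNs on the control unit, which maps them onto arrays behind it and applies its own services, such as remote replication, to every write that passes through.

class ExternalArray:
    """A lower cost or legacy array: basic block reads and writes only."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        return self.blocks.get(lba, b"\x00" * 512)

class ControlUnit:
    """Virtualising control unit: hosts see virtual LUNs, and every LUN,
    internal or external, inherits the control unit's services."""
    def __init__(self):
        self.luns = {}      # virtual LUN id -> backing array
        self.replicas = {}  # virtual LUN id -> remote replica target

    def virtualise(self, lun_id, array):
        self.luns[lun_id] = array

    def enable_replication(self, lun_id, target):
        self.replicas[lun_id] = target

    def write(self, lun_id, lba, data):
        # The service runs in the control unit, so a legacy array behind
        # it gets replication it never natively supported.
        self.luns[lun_id].write(lba, data)
        if lun_id in self.replicas:
            self.replicas[lun_id].write(lba, data)

cu = ControlUnit()
legacy = ExternalArray("legacy-tier-2")
cu.virtualise("lun0", legacy)
cu.enable_replication("lun0", ExternalArray("remote-copy"))
cu.write("lun0", 0, b"data")  # written to the legacy array and replicated

In this arrangement the legacy array never needs its own replication engine; the service lives once, in the control unit, and applies uniformly to whatever storage sits behind it.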

7. Services Oriented Storage: Services Oriented Storage will become a requisite complement to Services Oriented Architecture (SOA) in the application space and to Services Oriented Infrastructure in the infrastructure space in order to achieve the dynamic data centre of the future. SOA depends on a virtualisation layer, provided by XML, which enables applications to share information and utilise common services like billing. Services Oriented Infrastructure depends on a virtualisation layer, provided by products like VMware, which enables operating systems to share the resources of a processor platform. Services Oriented Storage requires a virtualisation layer in the storage control unit which enables other storage systems to leverage its services, like a high performance global cache, distance replication, tiered storage, and thin provisioning.

8. Convergence of Content, File, and Block Based Storage Services: Instead of separate stovepipe storage systems for content (archive), file, and block storage, we will see the convergence of these storage types onto a common virtualisation platform. High availability clusters of content servers and file servers will use a common block virtualisation services platform, under one common set of management tools. This will enable content servers or file servers to leverage common block services like distance replication, thin provisioning, or virtualisation of heterogeneous storage systems.

9. Thin Provisioning: Thin provisioning will provide the biggest benefit in increasing the utilisation of storage, by eliminating the waste of allocated but unused storage capacity. These savings are multiplied many times over by eliminating the need to copy that allocated but unused capacity every time a copy is required for backup cycles, replication, data mining, development testing, data distribution, and so on. The implementation of thin provisioning should be provided as a service on a storage virtualisation platform so that it can benefit existing storage systems through virtualisation; the benefits of thin provisioning would be defeated if it required yet another standalone storage system. This ability to increase utilisation will be embraced by Green advocates and will be seen as a way to contain costs.
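
As a rough illustration of the mechanism (a hypothetical sketch; ThinVolume, StoragePool, and the page size are invented for the example), a thin volume advertises a large virtual capacity but consumes physical pages from a shared pool only when a region is first written:

PAGE_SIZE = 32 * 1024 * 1024  # hypothetical 32 MB allocation page

class StoragePool:
    """Shared physical capacity, handed out one page at a time."""
    def __init__(self, physical_pages):
        self.free_pages = physical_pages

    def allocate_page(self):
        if self.free_pages == 0:
            raise RuntimeError("pool exhausted: add physical capacity")
        self.free_pages -= 1
        return bytearray(PAGE_SIZE)

class ThinVolume:
    """Advertises virtual_size to the host; allocates pages only on write."""
    def __init__(self, pool, virtual_size):
        self.pool = pool
        self.virtual_size = virtual_size
        self.page_table = {}  # page index -> physical page

    def write(self, offset, data):
        # Assumes the write fits within one page, to keep the sketch short.
        index = offset // PAGE_SIZE
        if index not in self.page_table:
            # First touch of this region: only now is capacity consumed.
            self.page_table[index] = self.pool.allocate_page()
        start = offset % PAGE_SIZE
        self.page_table[index][start:start + len(data)] = data

    def allocated_bytes(self):
        return len(self.page_table) * PAGE_SIZE

pool = StoragePool(physical_pages=100)
vol = ThinVolume(pool, virtual_size=10 * 2**40)  # host sees 10 TB
vol.write(0, b"application data")
print(vol.allocated_bytes())  # one 32 MB page, not 10 TB

Because unwritten regions occupy no physical pages, there is also nothing to copy when the volume is replicated or cloned, which is where the multiplied savings described above come from.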

10. Deduplication: Deduplication will be implemented by all the major backup vendors. Deduplication is especially effective at eliminating duplicated data in backups. The ability to reduce a stream of data by 20 to 30 times will be extremely valuable in reducing the cost of storing data, to the point that it will be feasible to store backup data to disk, where the operational, availability, and reliability characteristics are better, rather than to tape. Other forms of deduplication, like single instance store for archives and copy on write for snapshots, will become more prevalent.
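
The principle behind backup deduplication can be sketched in a few lines (a simplified illustration, not a vendor implementation; fixed-size chunking and the DedupStore class are assumptions for the example): the backup stream is split into chunks, each chunk is fingerprinted, and only previously unseen chunks are actually stored, so a second backup of largely unchanged data adds almost nothing:

import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size

class DedupStore:
    def __init__(self):
        self.chunks = {}  # SHA-256 fingerprint -> chunk (stored once)

    def ingest(self, stream):
        """Store a byte stream; return its recipe (list of fingerprints)."""
        recipe = []
        for i in range(0, len(stream), CHUNK_SIZE):
            chunk = stream[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            if fp not in self.chunks:
                self.chunks[fp] = chunk  # new data: store it once
            recipe.append(fp)            # seen before: reference only
        return recipe

    def restore(self, recipe):
        return b"".join(self.chunks[fp] for fp in recipe)

store = DedupStore()
monday = b"A" * (10 * CHUNK_SIZE)                      # ten identical chunks
tuesday = b"A" * (9 * CHUNK_SIZE) + b"B" * CHUNK_SIZE  # one chunk changed
r1, r2 = store.ingest(monday), store.ingest(tuesday)
print(len(r1) + len(r2), "references,", len(store.chunks), "unique chunks")
assert store.restore(tuesday) == tuesday  # restores exactly

Here two backup runs produce twenty chunk references but only two unique chunks on disk. Single instance store applies the same idea at the level of whole files or messages, keeping one copy of an attachment that was mailed to a hundred recipients.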