In Spring Thoughts Turn to Software-Defined Storage

This past Friday, May 31, three datacenter players coincidentally drew separate lines in the sand on the same day over the future of enterprise storage.

Hitachi Data Systems (HDS), Nutanix and NetApp each published a very different perspective on software-defined storage (SDS). Hitachi CTO Hu Yoshida makes the case in “Software Defined Storage is not about Commodity Storage” that intelligence belongs in proprietary arrays rather than in software. Nutanix CEO Dheeraj Pandey takes the exact opposite position in “Software-defined Storage: Our Take.” And NetApp Virtualization Solutions Architect Nick Howell claims, in a personal post titled “OK, Sure, We’ll Call it ‘Software-Defined Storage’,” that NetApp has been selling SDS for years.

The debate on SDS is not just one of semantics. As Pandey points out in his article, SDS is an essential component of a software-defined datacenter (SDDC), which in turn is at the heart of the private cloud. Customers who buy into a manufacturer’s vision of SDS are setting the course for their own future SDDC and private cloud initiatives.


Datacenter Storage Evolution

EMC introduced the Symmetrix in 1990, and larger organizations increasingly moved their data from local drives to central arrays in order to benefit from shared access, enterprise management and upgradability. By the time Google debuted in 1998, Yahoo was the incumbent search market leader. Like AltaVista, eBay and the other large Internet firms of the period, it ran the bulk of its business on proprietary storage arrays. Yahoo was even featured in NetApp co-founder Dave Hitz’s book, How to Castrate a Bull.

Google was confident in the superiority of its search algorithm, and anticipated billions of users searching trillions of objects. It knew that Yahoo’s shared storage model simply wouldn’t scale to handle this type of volume, to say nothing of the expense and complexity it entailed.

Google recognized that a SAN utilizes the same basic Intel components as a server. Rather than placing storage into a proprietary and expensive SAN or NAS, the company aggregated the local drives of simple, custom-built servers. The company hired a handful of scientists from prestigious universities to build the Google File System (GFS) in order to achieve massively parallel computing with exceptional fault tolerance. Google also invented the MapReduce and NoSQL technologies to enable linear scalability without any performance degradation. This model eliminated network traffic between the compute and storage tiers and was much simpler to manage.
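The idea of aggregating local drives can be made concrete with a minimal sketch. It is not Google's implementation: the function names, the server names, the round-robin placement and the tiny chunk size are all assumptions chosen for illustration. The only details borrowed from the GFS paper are that files are split into fixed-size chunks and that each chunk is kept on several servers (three by default).

```python
CHUNK_SIZE = 4   # bytes per chunk; tiny for illustration (GFS used 64 MB chunks)
REPLICAS = 3     # copies kept of every chunk (the GFS default)


def split_into_chunks(data, size=CHUNK_SIZE):
    """Cut a byte string into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def place_chunks(data, servers, replicas=REPLICAS):
    """Spread chunk replicas over the servers' local drives.

    Hypothetical round-robin policy: chunk i is stored on servers
    i, i+1, ..., i+replicas-1 (wrapping around), so every replica of a
    chunk lands on a different machine.
    """
    if replicas > len(servers):
        raise ValueError("need at least as many servers as replicas")
    placement = {name: {} for name in servers}
    for idx, chunk in enumerate(split_into_chunks(data)):
        for r in range(replicas):
            server = servers[(idx + r) % len(servers)]
            placement[server][idx] = chunk
    return placement


def read_file(placement, failed=frozenset()):
    """Reassemble the file using only the servers that are still alive."""
    # Chunk count comes from the full placement map, standing in for the
    # metadata a central master would keep.
    total = 1 + max(
        (idx for chunks in placement.values() for idx in chunks), default=-1
    )
    surviving = {}
    for name, chunks in placement.items():
        if name not in failed:
            surviving.update(chunks)
    missing = [idx for idx in range(total) if idx not in surviving]
    if missing:
        raise IOError("chunks lost on all replicas: %s" % missing)
    return b"".join(surviving[idx] for idx in range(total))


if __name__ == "__main__":
    servers = ["s1", "s2", "s3", "s4", "s5"]
    data = b"software-defined storage"
    placement = place_chunks(data, servers)
    # Two of five commodity servers fail; the file is still readable.
    print(read_file(placement, failed={"s2", "s4"}) == data)
```

With three replicas on distinct machines, any two servers can fail without data loss. Losing three servers that happen to hold all copies of one chunk makes `read_file` raise an error, which is why a real system re-replicates chunks as soon as a failure is detected.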

Google’s converged infrastructure helped it quickly become the dominant search engine. Robin Harris of StorageMojo estimated that Yahoo spent 3 to 10 times more on storage than Google. He said that for Yahoo, it was like bringing a knife to a gun fight.

In keeping with its open systems philosophy, Google published a paper on GFS in 2003. This eventually led to the adoption of a similar architecture by Amazon with DynamoDB, by Facebook with Haystack and by Yahoo with Hadoop. Twitter, Salesforce, eBay and even Microsoft Azure now also use scale-out local storage infrastructures rather than SANs for their primary businesses.

But even as the Internet leaders embraced the Google scale-out datacenter model, commercial and government enterprises of all sizes were going in the opposite direction by purchasing arrays in order to take advantage of the VMware capabilities of vMotion, High Availability and Fault Tolerance. Ironically, SANs were never built with virtualization in mind. They were meant for a one-to-one relationship between LUN and physical server rather than for many different workloads on a single LUN.


A Different Approach to the Datacenter 

A couple of the developers of GFS saw an opportunity to bring the advantages of true convergence to commercial and government enterprises by leveraging the hypervisor itself. They, along with engineers from companies such as Oracle, VMware, Microsoft and Facebook, spent three years developing the Nutanix Distributed File System (NDFS), which is at the heart of the Nutanix Virtual Computing Platform (VCP).

Nutanix, like Google, believes that a datacenter architecture utilizing separate tiers of servers and storage arrays is fundamentally flawed. A datacenter should be virtualized, with the infrastructure intelligence residing in software rather than in proprietary equipment. Commodity hardware is an essential SDS component because it can quickly be upgraded as CPU, flash and HDD technology advances. This model is significantly less expensive to implement, more resilient and vastly more scalable. It is also much simpler to manage.

SDS, as defined by Pandey, also serves as the underpinning of an SDDC. Abstracting all datacenter components from the underlying physical resources provides more flexibility and versatility. Specific networking, storage and compute equipment is no longer required.

NetApp and HDS, with the majority of their revenues stemming from proprietary arrays, have no choice but to defend the datacenter status quo. This is why Hitachi’s Yoshida insists that intelligent hardware is necessary to enable software-defined storage.

While Pandey readily admits that Nutanix is just scratching the surface of SDS, he presents a compelling case that there are seven principles that constitute the “true north” of SDS, all of which Nutanix embraces. This approach to SDS enables actual convergence of the compute and storage tiers. Anything else is just adjacency.


Contribution: Thanks to Lane Leverett, VCDX, for his edits to this article.


See Also:

Ex-Google Man Sells Search Genius to Rest of World. 12/21/2011. Cade Metz. Wired.

The Battle for Convergence. 12/12/2012. Stuart Miniman. Wikibon Blog.

The Efficient Cloud: All of Salesforce runs on only 1,000 servers. 03/23/2009. Erick Schonfeld. TechCrunch.

How Yahoo Can Beat Google. 07/05/2007. Robin Harris. StorageMojo.

The Google File System. 2003. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. Google Research Publications.

How to Castrate a Bull. 2009. Dave Hitz. Book by NetApp Co-Founder.