For service providers, the rise of cloud computing is both a threat and an opportunity. Shared and dedicated hosting services are all under attack by a variety of emerging cloud services that provide hourly pricing and automation through APIs. An emerging class of competitors, led by Amazon Web Services (AWS), is not just building a new market but siphoning spending from established providers. Those who don’t respond quickly face the possibility of declining revenues or even obsolescence. The good news is that cloud services also represent a large opportunity for serving new customers and increasing revenue. However, in order to compete, existing service providers must move quickly to deliver cloud services of their own.

The race to offer cloud services is already well under way. During the first half of 2012, Neovise completed research on 500 of the largest hosting providers in the world. The study revealed that 229 of them, or 45.8%, already offer cloud-based infrastructure-as-a-service (IaaS).

While service providers are flocking to IaaS, storage-as-a-service (STaaS) is another huge segment of the cloud computing market that should not be overlooked. The demand for STaaS is increasing exponentially, driven by trends like video, mobility, big data, file sharing and cloud computing itself. In order to serve each of these growing needs, several different types of cloud storage services are coming online. Object storage is perhaps the most important of these new storage services since it provides the low cost, persistent, reliable data storage foundation underlying many STaaS and IaaS offerings.
Object storage not only represents a huge revenue opportunity, it is one of the fastest growing segments of the cloud computing market. The most successful object storage service to date is Amazon Simple Storage Service (S3). In June 2012, just six years after its launch, the number of objects stored in S3 reached one trillion (1,000,000,000,000 or 10^12). While the typical object storage service will not achieve these rates, the never-ending growth in storage demand provides a viable environment in which to compete.
Block storage. Block storage devices provide fixed-size chunks of raw storage capacity. At this level, data is stored without any concept of data format or type. The data is simply a series of 0s and 1s and it is up to higher-level applications and/or file systems to keep track of data location, context and meaning. The storage area network (SAN) is a commonly used example of a block-based storage system. SANs are dedicated networks that provide access to consolidated, block-level data storage devices such as disk arrays. Many SANs use the iSCSI standard to send and receive data over IP networks, giving applications and file systems the illusion of locally attached disks.

File storage. Network attached storage (NAS) is a common example of a file-based storage system. NAS devices use file systems built on top of block storage devices. File systems serve two key purposes: they establish the notion of a file and they provide a structure for organizing files.

1. Files are data structures that typically live within operating systems and keep track of a related set of blocks that contain the contents of the file. Files also have associated information known as metadata that describes the file. File name, length, type and creation date are all examples of metadata.
2. File organization is also accomplished using data structures that are typically part of operating systems. File systems offer directories (also called folders on some systems) that are used to store related files and/or additional directories. Directories are organized using a hierarchical naming convention starting with a root or starting point. Within a given file system, every directory is uniquely identified using its full path name starting from the root directory. Files are uniquely identified by the full path name of the directory where they are stored, followed by their name.
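The two file system roles described above can be seen in a minimal Python sketch. It is an illustration only: the directory layout and file name are hypothetical, and modification time stands in for creation date because creation time is not reported consistently across platforms.

```python
import os
import tempfile
from pathlib import Path


def describe_file(path: Path) -> dict:
    """Collect a few pieces of file system metadata for one file."""
    info = os.stat(path)
    return {
        "full_path": str(path.resolve()),  # unique identity within the file system
        "name": path.name,
        "type": path.suffix,
        "length": info.st_size,            # size in bytes
        "modified": info.st_mtime,         # seconds since the epoch
    }


with tempfile.TemporaryDirectory() as root:
    # Hypothetical hierarchy, created only for this illustration.
    folder = Path(root) / "reports" / "2012"
    folder.mkdir(parents=True)
    report = folder / "summary.txt"
    report.write_text("object storage", encoding="utf-8")

    meta = describe_file(report)
    # Directory components that must be resolved, in order, to reach the file.
    parts = report.relative_to(root).parts
```

Here `parts` lists every directory that has to be resolved before the file itself is reached, which is the hierarchical naming described in item 2.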
Understanding Object Storage
Many new technologies are easily compared to their predecessors, requiring just a small intuitive leap to understand them. On the other hand, some new technologies bear only a slight resemblance to past approaches. Object storage fits in this latter category and introduces a completely new way of looking at storage. Rather than describing how object storage is similar to block or file-based storage, it is better to describe how object storage is different.
While object storage systems may contain files and other types of data, they are not actually file systems and do not establish the notion of files. Just as the name implies, object storage systems contain objects. These objects consist of an object identifier (OID), data and metadata. There is no object organization system equivalent to the hierarchical directories found in file systems. The object storage system does not impose any structure on objects as file systems do for different file types. Even when object storage systems contain files, the object system simply views them as data. Unlike file systems, the metadata for individual objects can be quite extensive.

Why Object Storage is Better

At first glance, object storage may seem to have a number of shortcomings. However, these systems have instead eliminated the capabilities that create performance bottlenecks and scalability limitations in file systems. For example, many operations on individual files require the file system to traverse every directory in the file’s path name, starting with the root directory. These data structures are shared by the entire file system and access to them can slow dramatically when hundreds of thousands or even millions of files are stored. Accessing metadata that is stored in the file system but outside individual files also serves as a barrier to scale and performance.

Object storage works much differently than file systems, in part by providing direct access to individual objects. All that is needed to access an object is an OID. Since all metadata for each object is stored within the object itself, there is no contention for shared resources such as those found in file systems (e.g. shared namespace, certain metadata and other file system data structures). This approach allows operations performed on objects to happen independently of operations on other objects. The result is nearly unlimited scale along with linear performance.
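The object model described above (an OID, data, and metadata carried with the object, all in a flat namespace) can be sketched as a toy in-memory store. This is a minimal illustration under stated assumptions, not how any particular product is implemented: the class names, the use of random UUIDs as OIDs, and the metadata keys are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    oid: str                                      # object identifier
    data: bytes                                   # opaque payload; the store does not interpret it
    metadata: dict = field(default_factory=dict)  # travels with the object itself


class FlatObjectStore:
    """Toy store: one flat namespace keyed by OID, with no directories."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        oid = uuid.uuid4().hex  # assumption: a random identifier serves as the OID
        self._objects[oid] = StoredObject(oid, data, dict(metadata))
        return oid

    def get(self, oid: str) -> StoredObject:
        # Direct access: a single lookup by OID, with no path to traverse.
        return self._objects[oid]


store = FlatObjectStore()
oid = store.put(b"\x89PNG...", content_type="image/png", owner="tenant-a")
obj = store.get(oid)
```

Because each lookup touches only the requested object and its own metadata, operations on different objects do not share a directory structure that could become a point of contention.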
Of course, object storage is not the best approach for every storage application. Databases, file systems and applications that require the lowest latencies tend to work best with block-level storage systems such as SANs. On the other hand, when chosen for the right usage scenario, object storage systems are hard to beat. Static data such as virtual machine images, videos, photos, email, backups and archives are particularly well suited for object storage.
Object Storage Requirements
Service providers interested in offering object-based STaaS must address two related sets of requirements: First, they must meet their own business requirements. Second, they must meet the technical requirements of their customers.
Before discussing business requirements, it is important to note that service providers may either build or buy object storage systems. Building – including extending an open source solution – has a number of inherent challenges including the need for deep expertise in software development, distributed systems architecture and design, object storage, security and more. Then, once the object storage service is operational, additional resources will still be needed for ongoing support and enhancements. Advantages provided by commercial solutions may include faster time to market and revenue, a company to stand behind the product and provide support, and extended capabilities such as integration with billing and payment systems.

Several of the top business requirements for a commercial object storage system are summarized here:

• Speed – The race is on to get to market with complete cloud solutions, including object storage. Service providers need complete solutions that support rapid time-to-market.
• Scale – While economies of scale provide an important advantage, many service providers need a solution that scales down, letting them start small and grow quickly.
• Revenue – In order to target the broadest set of potential customers, an object storage solution must meet the requirements of a variety of workloads.
• Cost – Object storage services are price competitive and service providers need affordable solutions that work on commodity hardware yet deliver high reliability.
• Differentiation – Service providers need an object storage solution that supports multiple architectures and that can be tuned across variables such as cost and performance.
As suggested by several of the business requirements, it is also critical to deliver an object storage service that meets the technical requirements of customers.
The items listed below represent some of the most important requirements categories for success with persistent, multi-tenant object storage services. Any purchase decision should consider these and other in-depth technical requirements, as well as hands-on validation through proof of concept projects and/or limited trials.

• Reliability – Resiliency to underlying hardware and network failures in availability zones, data centers and regions or other similar failure-containment constructs.
• Security – Isolation of data from other tenants, access control lists for users and groups, read/write access control, encryption and strong authentication.
• Scale – Huge numbers of objects are stored in object storage services so they must scale to billions of objects or more, while supporting large object sizes.
• Performance – Fast reads and writes while handling a broad range of transaction rates and traffic conditions and while serving tenants in availability zones in every region.
• Control – Control over service level agreements (SLA) so that performance, reliability and cost tradeoffs can be made. Customer control over data location.
There is an additional requirement to consider relative to an existing object storage service, Amazon Simple Storage Service (S3).
Amazon S3 Compatibility
S3 is the object storage service for AWS, providing its customers a highly durable storage infrastructure for a wide variety of applications. S3 objects come from a variety of sources including Amazon Elastic Compute Cloud (EC2) customers seeking persistent storage and partners that have built services on top of S3. Due to the overall success of S3, it serves as an industry benchmark for multi-tenant object storage services. Those wishing to compete in the IaaS space in general or the object storage space in particular should seriously consider building an object storage service compatible with the S3 API. By providing this level of S3 compatibility, service providers move one step closer to tapping into the ecosystem of solutions that run on S3.
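One practical consequence of API compatibility is that a client written for S3 should only need a different endpoint to address a compatible service. The sketch below builds object URLs in the path-style form of the S3 REST API (endpoint, then bucket, then key). It is a minimal illustration: the function name and the `storage.example.com` endpoint are hypothetical, and request signing and authentication are deliberately left out.

```python
from urllib.parse import quote


def path_style_url(endpoint: str, bucket: str, key: str) -> str:
    """Build a path-style object URL: <endpoint>/<bucket>/<key>."""
    # quote() percent-encodes unsafe characters but keeps "/" so key prefixes survive.
    return f"{endpoint.rstrip('/')}/{quote(bucket)}/{quote(key)}"


key = "2012/june/vm image.img"
aws_url = path_style_url("https://s3.amazonaws.com", "backups", key)
# Hypothetical S3-compatible provider: only the endpoint changes.
provider_url = path_style_url("https://storage.example.com", "backups", key)
```

The bucket and key portions of the two URLs are identical, which is what allows existing tools in the S3 ecosystem to be pointed at a compatible service.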
revenue are so critical to service providers, Cloudian was also designed for deployment from start to finish in just a few months.

Service providers are following several different approaches to adding object storage services to their offerings. Some start by offering integrated compute and object storage services. Others start with standalone object storage and either add compute services later or remain cloud storage specialists. Regardless of which path is chosen, the majority of service providers like to start small – with perhaps a 100TB system – and then grow large quickly. Cloudian supports this model by scaling from a minimum of two nodes up to hundreds of nodes with petabytes of data across multiple data centers. The company also offers the Cloudian Community Edition for free in deployments up to 100TB. This is an excellent way for service providers as well as enterprises to get accustomed to object storage.

Another important aspect of scaling is affordability, which also impacts profit margins. Cloudian enables service providers to leverage heterogeneous commodity servers for cost-effective horizontal scalability. Since heterogeneous systems are supported, new systems can be mixed with old when it is time for growth. New node detection and data rebalancing are performed automatically without service interruption.

For reliability, Cloudian is built on top of a NoSQL storage layer and has a fully distributed, peer-to-peer architecture, with no single point of failure. It is resilient to network and node failures with no data loss due to the automatic replication and recovery processes inherent to the architecture. The system can be deployed across multiple sites and data centers to provide geographic redundancy. Upgrades and updates can be performed without service interruption.
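The general idea behind replication-based resilience can be shown with a toy model. This sketch is not Cloudian's actual mechanism, which the text does not detail: the class, the hash-based placement rule, and the node and replica counts are all assumptions chosen for illustration.

```python
import hashlib


class ReplicatedStore:
    """Toy cluster: each object is written to `replicas` consecutive nodes."""

    def __init__(self, node_count: int, replicas: int) -> None:
        self.nodes = [dict() for _ in range(node_count)]
        self.down = set()  # indexes of nodes currently marked as failed
        self.replicas = replicas

    def placement(self, oid: str) -> list:
        # Assumption: hash the OID to pick a starting node, then take its neighbors.
        start = int(hashlib.sha256(oid.encode()).hexdigest(), 16) % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, oid: str, data: bytes) -> None:
        for n in self.placement(oid):
            self.nodes[n][oid] = data

    def get(self, oid: str) -> bytes:
        # Any surviving replica can serve the read.
        for n in self.placement(oid):
            if n not in self.down and oid in self.nodes[n]:
                return self.nodes[n][oid]
        raise KeyError(oid)


cluster = ReplicatedStore(node_count=4, replicas=2)
cluster.put("backup-001", b"archive bytes")
cluster.down.add(cluster.placement("backup-001")[0])  # simulate one node failure
recovered = cluster.get("backup-001")                 # served by the second replica
```

With two replicas the read survives the loss of one node; a real system would also re-create the missing replica on another node, which this sketch omits.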
customers. Business requirements include speed to market, the ability to start small and grow large quickly, revenue and addressable market through offering services with broad appeal, affordability through support of commodity hardware, and differentiation by supporting flexible architectures and other configurable choices. Technical requirements include extremely high reliability and security, massive scale for individual tenants and across the entire service, solid performance in a variety of usage conditions, and control by supporting flexible choices and tradeoffs between SLA variables such as performance, reliability and cost.

Very few service providers have the resources and skills to build their own object storage service. While open source offerings are a step in the right direction, they still require deep expertise in a variety of technical domains. Even after deployment they consume valuable resources for ongoing support and enhancements. Advantages provided by commercial solutions may include faster time to market and revenue, a company to stand behind the product and provide support, and extended capabilities such as integration with billing and payment systems.

Cloudian helps service providers quickly and affordably build highly reliable, massively scalable object storage services that are API-compatible with Amazon S3. Neovise recommends serious consideration of Cloudian as the basis for large-scale, object-based storage services, particularly for those seeking S3 compatibility. To determine whether Cloudian is right for you, consider doing a proof of concept or in-depth trial. Also keep in mind that the Cloudian Community Edition is license-free up to 100TB – a great way to get started with object storage.
About Neovise

Based on independent research and analysis, Neovise delivers essential knowledge and guidance to cloud-related technology vendors, service providers and systems integrators, as well as business and IT organizations that purchase and use cloud-related services and technology. Our offerings include research, advisory and collateral development services that help our customers, and their customers, make optimal decisions and formulate winning strategies. Research. Analyze. Neovise. For more information, visit www.neovise.com.