Object storage has by far the simplest interface out there, with no need for complicated SCSI drivers, HBA drivers, multipathing tools, or volume managers embedded in your operating system. All you need to do is point your application at an HTTP endpoint and use a small set of verbs to describe what you want to do with a piece of data.
Do you want to PUT it somewhere for safekeeping? Do you want to GET it so that you can do some work with that piece of data? Or do you want to LIST the contents of your bucket?
Perhaps these three verbs are an oversimplification of what is possible with object storage, but this is loosely where cloud object storage began. It was an initiative to make storage more economical by removing proprietary technologies and creating a simple, scalable storage solution without the complexities of legacy technologies.
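To make the three verbs concrete, here is a minimal sketch in Python. It is a toy, in-memory model rather than a client for any real service: the `ToyBucket` class and its method names are made up for illustration, and a real object store would expose the same operations over HTTP with authentication on top.

```python
class ToyBucket:
    """Toy in-memory stand-in for a bucket: a flat mapping of keys to bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # PUT stores (or overwrites) the whole object under its key.
        self._objects[key] = bytes(data)

    def get(self, key):
        # GET returns the whole object; a missing key raises KeyError.
        return self._objects[key]

    def list_keys(self, prefix=""):
        # LIST returns the keys in sorted order, optionally filtered by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = ToyBucket()
bucket.put("photos/cat.jpg", b"cat bytes")
bucket.put("photos/dog.jpg", b"dog bytes")
bucket.put("videos/intro.mp4", b"video bytes")

print(bucket.get("photos/cat.jpg"))
print(bucket.list_keys("photos/"))
```

Note that there are no directories here: `photos/` is just the start of a key, and listing by prefix is what gives the appearance of folders.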
Uses of Object Storage
When building a new application, you will need to build it with object storage in mind. Instead of relying on cluster-aware filesystems and quorum devices, the application will need to handle failover and data consistency itself to remain available during hardware failures. Alternatively, many off-the-shelf applications now have native deployment models for working with cloud-native infrastructure and, most importantly, with object storage. When your application has finished processing or creating a piece of data, it can write that data to an object store for safekeeping and easily retrieve it as and when needed.
We can even use object storage buckets to trigger events. Imagine a mobile app that uploads photos or videos, which then need some processing before publication. Once a photo or video is uploaded to an object store, an event is triggered to let your application know that there is a new object to be processed. Once that object has been processed, the output could be written to a bucket that triggers another job to push it to your Content Distribution Network (CDN).
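The upload pipeline described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `NotifyingBucket`, `process_upload`, and `push_to_cdn` are hypothetical names, the "processing" is a placeholder, and the handlers run synchronously, whereas a real object store would typically deliver notifications asynchronously through a queue or a function service.

```python
class NotifyingBucket:
    """Toy bucket that calls registered handlers whenever an object is written."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self._handlers = []

    def on_object_created(self, handler):
        # Register a callback taking (bucket_name, key).
        self._handlers.append(handler)

    def put(self, key, data):
        self.objects[key] = data
        # Notify every subscriber that a new object has arrived.
        for handler in self._handlers:
            handler(self.name, key)


uploads = NotifyingBucket("uploads")
processed = NotifyingBucket("processed")
published = []  # stand-in for the CDN: records which keys were pushed


def process_upload(bucket_name, key):
    raw = uploads.objects[key]
    # Placeholder for real work such as resizing or transcoding.
    processed.put(key, raw.upper())


def push_to_cdn(bucket_name, key):
    published.append(key)


uploads.on_object_created(process_upload)
processed.on_object_created(push_to_cdn)

# A single upload now flows through both stages.
uploads.put("holiday.jpg", b"raw bytes")
print(published)
```

The point of the design is that neither stage calls the other directly: each one only reacts to an object appearing in a bucket, so stages can be added or replaced independently.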
Where can I get Object Storage?
There are lots of options available, and all public clouds have object storage offerings. Some of the most well-known are Azure Blob Storage, Google Cloud Storage, and Amazon S3. Each of these offerings has its own API; however, the most commonly used is the Amazon S3 API.
The S3 API has been implemented in other storage solutions, such as Ceph and to a certain extent OpenStack Swift. However, Swift’s implementation is not as feature-complete as Ceph’s and is lacking some features around object lifecycle management and notifications.
Of course, the major storage vendors, such as Dell EMC and NetApp, also have solutions, which have largely standardised on the S3 API. Yet when compared with open source solutions, these remain cumbersome and expensive.
Is private object storage a solution for you?
The public cloud might not always be the right choice for all workloads, or for storing all of your data. The public cloud is instantly accessible, which makes it a great way to get started, but over time and as your data set grows, it can become rather cost-inefficient. Typically, cloud provider costs include charges not only for storing data, but also for retrieving it. Some providers also charge for the number of API operations that you request, and for network transfer on top!
A privately hosted solution can provide significant savings when you have predictable capacity requirements. You can also manage your own transit costs, either into a public cloud via products like Direct Connect or ExpressRoute, or at no cost within your own data centre or colocation facility.
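To see how these separate charges add up, here is a rough back-of-the-envelope sketch in Python. The `Prices` fields mirror the four kinds of charge mentioned above, but every number below is a made-up placeholder, not a real provider's price list; substitute the figures from your own provider and workload.

```python
from dataclasses import dataclass


@dataclass
class Prices:
    """Hypothetical unit prices; replace with your provider's real figures."""
    storage_per_gb: float      # per GB stored per month
    retrieval_per_gb: float    # per GB read back
    per_1000_requests: float   # per 1,000 API operations
    egress_per_gb: float       # per GB of network transfer out


def monthly_cost(prices, stored_gb, retrieved_gb, requests, egress_gb):
    """Break a monthly bill down into its four components plus a total."""
    bill = {
        "storage": stored_gb * prices.storage_per_gb,
        "retrieval": retrieved_gb * prices.retrieval_per_gb,
        "requests": requests / 1000 * prices.per_1000_requests,
        "egress": egress_gb * prices.egress_per_gb,
    }
    bill["total"] = sum(bill.values())
    return bill


# Placeholder prices and workload, chosen only to illustrate the arithmetic.
example_prices = Prices(storage_per_gb=0.02, retrieval_per_gb=0.01,
                        per_1000_requests=0.005, egress_per_gb=0.09)
bill = monthly_cost(example_prices, stored_gb=10_000, retrieved_gb=2_000,
                    requests=5_000_000, egress_gb=2_000)

for item, amount in bill.items():
    print(f"{item:>10}: {amount:10.2f}")
```

Even with placeholder numbers, the breakdown shows why the storage line on its own can understate the bill: retrieval, requests, and transfer all scale with how heavily the data is used rather than how much of it there is.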
A Ceph cluster that is compatible with both the AWS S3 API and the OpenStack Swift API can be a cost-effective way to provide object storage to your applications, by combining open-source software with commodity hardware. If you would like to find out more about that option, visit our Ceph page or get in touch!