OpenStack storage is probably one of the most complex topics in OpenStack architecture right after networking. There are many different storage options, at least a few storage services, and tons of supported storage backends. It is very easy to get lost.
But do not worry, there is hope. Since OpenStack was initially created as an open-source implementation of the Amazon Web Services Elastic Compute Cloud (AWS EC2), its storage architecture is quite similar to that of leading public clouds. This similarity makes it relatively easy to learn for someone who already has some cloud experience.
Ready to uncover OpenStack storage fundamentals? Let’s dive in!
We are going to start with an overview of the OpenStack storage options (or types). This is the topic that creates the most confusion, especially for people with a traditional data centre or VMware background.
Ephemeral storage is the default storage option in OpenStack. It is attached to every instance in the form of a file system as part of the provisioning process. As a result, users do not need to think too deeply about storage when launching an instance.
At the same time, ephemeral storage is volatile. It is deleted permanently once users terminate instances. Therefore, it should only be used for the purpose of storing temporary data. This includes common operating system (OS) files, caches, buffers, etc.
Ephemeral storage is managed by the OpenStack Nova service.
While common OS files, caches, and buffers can be lost with no cost, some data, such as database tables, cannot. For this kind of data, block storage is a better option. Block storage is persistent storage that is managed independently of instance provisioning and termination.
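The difference in lifecycle can be illustrated with a small toy model. This is only a sketch of the concept, not OpenStack code: the `Instance` and `Volume` classes and all names in it are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Persistent block storage; its lifetime is independent of any instance."""
    name: str
    data: dict = field(default_factory=dict)


@dataclass
class Instance:
    """A virtual machine with one ephemeral disk and optional attached volumes."""
    name: str
    ephemeral: dict = field(default_factory=dict)
    volumes: list = field(default_factory=list)

    def attach(self, volume: Volume) -> None:
        self.volumes.append(volume)

    def terminate(self) -> None:
        # Ephemeral data is destroyed together with the instance...
        self.ephemeral.clear()
        # ...while volumes are only detached and keep their contents.
        self.volumes.clear()


db_volume = Volume("db-data")
vm = Instance("db-server")
vm.attach(db_volume)

vm.ephemeral["/tmp/cache"] = "scratch data"
db_volume.data["customers"] = "important rows"

vm.terminate()

print(vm.ephemeral)    # the ephemeral disk is empty after termination
print(db_volume.data)  # the volume still holds the database rows
```

After `terminate()` the scratch data is gone, while the volume can be attached to a new instance with its contents intact.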
Block storage in OpenStack is available in the form of volumes. Users can create them, attach them to instances, and access them as block devices from within the instance. Optionally, multiple types of volumes can be created, each potentially serviced by different storage tiers. This allows users to select high-performance volumes backed by solid-state drives (SSDs) or slower, capacity-focused volumes backed by hard disk drives (HDDs) depending on their needs.
To further ensure data protection, users can then snapshot and back up their volumes, allowing for recovery in case of data loss.
Block storage is managed by the OpenStack Cinder service.
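To make the create-and-attach workflow more concrete, here is a minimal sketch of the JSON request bodies involved. It assumes the field names of the Block Storage v3 API (volume creation) and the Compute API (volume attachment); please verify them against the API reference for your release. The helper functions, the `ssd-fast` volume type, and the `VOLUME_ID` placeholder are hypothetical.

```python
import json


def volume_create_body(name, size_gb, volume_type=None):
    """Build the body for a 'create volume' request (size is in GiB)."""
    body = {"volume": {"name": name, "size": size_gb}}
    if volume_type is not None:
        # The volume type selects the storage tier, e.g. SSD or HDD backed.
        body["volume"]["volume_type"] = volume_type
    return body


def volume_attach_body(volume_id):
    """Build the body for attaching an existing volume to an instance."""
    return {"volumeAttachment": {"volumeId": volume_id}}


# A fast volume for a database, using a hypothetical "ssd-fast" type.
print(json.dumps(volume_create_body("db-data", 100, "ssd-fast"), indent=2))

# Attaching it; "VOLUME_ID" stands in for the ID returned by the create call.
print(json.dumps(volume_attach_body("VOLUME_ID"), indent=2))
```

In practice, most users would let the `openstack` command-line client or an SDK build these requests for them.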
In some use cases, it is not enough to make the data available to a single instance. It must be shared across multiple instances instead, similar to network file system (NFS) concepts. In such cases, file storage is the recommended solution.
OpenStack’s file storage enables the creation of persistent file shares. Users can then mount them on their instances and access them as a remote file system. As with block storage volumes, file shares are created independently of instances and are not deleted by default when those instances are terminated.
File storage is managed by the OpenStack Manila service.
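As a rough illustration, the sketch below builds a share creation request body and the mount command that each instance would run. The field names follow the Shared File Systems API as far as we are aware, so check them against the API reference. The helper functions, the export location, and the mount point are hypothetical.

```python
def share_create_body(name, size_gb, protocol="NFS"):
    """Build the body for a 'create share' request (size is in GiB)."""
    return {"share": {"name": name, "size": size_gb, "share_proto": protocol}}


def nfs_mount_command(export_location, mount_point):
    """Return the command an instance would run to mount an NFS share."""
    return ["mount", "-t", "nfs", export_location, mount_point]


print(share_create_body("team-docs", 50))

# The same export can be mounted on several instances at once,
# which is what distinguishes a share from a block storage volume.
export = "10.0.0.5:/shares/team-docs"  # hypothetical export location
for instance in ("web-1", "web-2"):
    print(instance, " ".join(nfs_mount_command(export, "/mnt/docs")))
```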
The last storage option available to OpenStack users — object storage — is totally different from any other option. Object storage is persistent cloud-native storage that provides built-in replication mechanisms for data durability and geo-redundancy.
Contrary to other storage types, object storage is not attached to instances at all. It is accessible through an application programming interface (API) instead. Its structure is flat, with no directory hierarchy, and every data chunk is treated as an individual object.
Object storage is managed by the OpenStack Swift service. However, most production OpenStack environments use the Ceph Object Gateway implementation instead, because of its stability and better performance.
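The flat structure is easier to grasp with a toy model. The sketch below is not Swift or Ceph Object Gateway code; the `Container` class is hypothetical. It stores every object under its full name in a single dictionary and shows how a `prefix` and `delimiter` filter, similar in spirit to the listing parameters offered by object storage APIs, can make a flat namespace look like folders.

```python
class Container:
    """A flat key-value namespace: object name -> object data."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]

    def list(self, prefix="", delimiter=None):
        """List object names, optionally grouping them into pseudo-folders."""
        results = set()
        for name in self._objects:
            if not name.startswith(prefix):
                continue
            rest = name[len(prefix):]
            if delimiter and delimiter in rest:
                # Collapse everything below the next delimiter into one entry.
                results.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
            else:
                results.add(name)
        return sorted(results)


container = Container()
container.put("logs/2024/app.log", b"...")
container.put("logs/2024/db.log", b"...")
container.put("logs/2025/app.log", b"...")
container.put("readme.txt", b"hello")

print(container.list())                              # all four object names
print(container.list(delimiter="/"))                 # ['logs/', 'readme.txt']
print(container.list(prefix="logs/", delimiter="/")) # ['logs/2024/', 'logs/2025/']
```

The slashes are simply part of the object names; no directory is ever created.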
All right! So we now know how OpenStack handles different types of storage use cases. But where is the data actually stored? This is where storage backends come in. Each OpenStack service can wrap around various solutions. There are lots of them supported by the upstream community. The following section provides an overview of the most popular.
Ceph is undoubtedly the most popular storage backend for OpenStack. 70% of production OpenStack clouds use Ceph for data storage purposes. Ceph is also the industry’s default software-defined storage (SDS) solution used by other infrastructure components, such as Kubernetes. It provides a built-in replication mechanism, self-healing capabilities, and can be used for block storage, file storage, and object storage.
While Ceph helps to unlock all the benefits of cloud-native storage, over the last few years some organisations have made significant investments in other storage solutions. Large storage arrays full of disks are still common in data centres. So, for those types of organisations, the Internet Small Computer Systems Interface (iSCSI) and vendor-contributed drivers are the answer. By using their existing storage systems as a backend, they can easily migrate to OpenStack from legacy infrastructure without modernising their storage.
Although not suitable for production, local storage is fully sufficient for testing and development. It is also a reasonable option for edge deployments if the size of the site does not allow for a more advanced storage backend. Local storage in OpenStack usually leverages a Logical Volume Manager (LVM) driver, using LVM volumes as data stores.
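To give a feel for how backends are wired in, here is a minimal sketch that generates a multi-backend Cinder configuration fragment with Python's `configparser`. The driver paths and option names are the ones we recall from the Cinder documentation, so please verify them for your release. The section names, the `cinder-volumes` volume group, the `volumes` pool, and the backend names are assumptions chosen for illustration.

```python
import configparser
import io

config = configparser.ConfigParser()

# Each name listed here must match a section defined below.
config["DEFAULT"] = {"enabled_backends": "lvm,ceph"}

# Local storage backed by an LVM volume group (assumed to exist already).
config["lvm"] = {
    "volume_driver": "cinder.volume.drivers.lvm.LVMVolumeDriver",
    "volume_group": "cinder-volumes",
    "volume_backend_name": "LVM",
}

# Ceph block devices stored in an assumed pool called "volumes".
config["ceph"] = {
    "volume_driver": "cinder.volume.drivers.rbd.RBDDriver",
    "rbd_pool": "volumes",
    "volume_backend_name": "CEPH",
}

buffer = io.StringIO()
config.write(buffer)
cinder_conf = buffer.getvalue()
print(cinder_conf)
```

A volume type can then be mapped to a `volume_backend_name`, which is how users end up choosing between storage tiers.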
There are many other backends available, and the list grows with every release. The choice of a particular technology depends on the specific use case and should always be preceded by appropriate analysis. Among other storage backends, the most notable are NetApp and Pure Storage.
Now that you’ve got an overview of OpenStack storage concepts, you might be wondering where to navigate from here to find more information about this topic.
Here are some useful links for you:
May the capacity be with you!