The flexibility of public cloud infrastructure allows for little to no upfront expense, and is great when starting a venture or testing an idea. But once a dataset grows and becomes predictable, storing it can become a significant base cost, compounded by additional costs that depend on how you consume that data.
Public clouds were initially popularised under the premise that workloads are dynamic, and that you could easily match available compute resources to the peaks and troughs in your consumption, rather than having to maintain mostly idle buffer capacity to meet peak user demands. In essence, this shifts sunk capital into variable operational expense.
However, it has become apparent that this isn’t necessarily true of public cloud storage. What is typically observed in a production environment is continual growth of all datasets. Data that is actively used for decision making or transactional processing in databases tends to age out, but must be retained for audit and accountability purposes. Training data for AI/ML workloads grows, allowing models to become more refined and accurate over time. Content and media repositories grow daily, and exponentially with the use of higher-quality recording equipment.
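Typically there are three areas where costs are incurred: storage capacity, transactions to store and retrieve the data, and egress when the data is accessed from outside the cloud provider’s infrastructure.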
If in the future you decide to move your data to another public cloud provider, you would incur these costs during migration too!
Imagine you have a dataset that’s 5PB and you want to understand its total cost of ownership (TCO) over 5 years. First we need to make some assumptions about the dataset and how frequently it will be accessed.
Over the lifetime of the dataset we will assume that it will be written twice, for 10PB of written data. We will also assume that it will be read 10 times, and that each object averages 10MB.
In a popular public cloud, object storage capacity starts at $0.023/GB, and as usage increases the price decreases to $0.021/GB. You are also charged for the transactions to store and retrieve the data. These prices sound low, but as you scale up and consider the multi-year cost, they can quickly rise to significant numbers.
For the 5PB example, the TCO over 5 years is over $7,000,000, and that’s before you even consider any charges for compute to interact with the data, or egress charges to access the dataset from outside of the cloud provider’s infrastructure.
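The arithmetic behind an estimate like this can be sketched in a few lines of Python. This is a minimal illustration, not the model used to produce the figure above: it assumes the per-GB price is charged monthly, uses decimal units (1PB = 1,000,000GB), applies a single flat capacity rate rather than tiered pricing, and reads "read 10 times" as ten full reads of the 5PB dataset. The per-request prices are hypothetical placeholders, and `estimate_tco` is a name invented for this example.

```python
GB_PER_PB = 1_000_000        # decimal units (assumption)
MB_PER_PB = 1_000_000_000    # decimal units (assumption)


def estimate_tco(capacity_pb, years, price_per_gb_month,
                 written_pb, read_pb, object_mb,
                 put_price_per_1k, get_price_per_1k):
    """Rough object storage cost estimate, excluding compute and egress."""
    # Capacity: flat monthly rate applied to the full dataset for the whole term.
    capacity = capacity_pb * GB_PER_PB * price_per_gb_month * years * 12

    # Transactions: one request per object written or read.
    objects_written = written_pb * MB_PER_PB / object_mb
    objects_read = read_pb * MB_PER_PB / object_mb
    writes = objects_written / 1000 * put_price_per_1k
    reads = objects_read / 1000 * get_price_per_1k

    return {
        "capacity": capacity,
        "writes": writes,
        "reads": reads,
        "total": capacity + writes + reads,
    }


# The article's scenario, using the lower capacity price as a flat rate.
example = estimate_tco(
    capacity_pb=5, years=5, price_per_gb_month=0.021,
    written_pb=10, read_pb=50, object_mb=10,
    put_price_per_1k=0.005,    # hypothetical placeholder price
    get_price_per_1k=0.0004,   # hypothetical placeholder price
)
for item, cost in example.items():
    print(f"{item:>8}: ${cost:,.0f}")
```

With these simplified inputs the result will not necessarily match the figure quoted above, which depends on the provider's full price list. What the sketch does show is the structure of the cost: capacity is multiplied by every month the data is retained, while transaction charges scale with the number of objects rather than their size.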
Is there another way to tackle these mounting storage costs, yet also retain the flexibility of deploying workloads in the cloud?
IT infrastructure is increasingly flexible, so with some planning it is possible to operate an open-source storage infrastructure based on Charmed Ceph that is fully managed by experts, located adjacent to a public cloud region, and connected to the public cloud via private links to ensure the highest availability and reliability. Using the same usage assumptions as before, a private storage solution can reduce your storage costs by 2-3x or more over a 3-5 year period.
Having your data stored using open-source Charmed Ceph in a neutral location, yet near to multiple public cloud providers, unlocks a new level of multi-cloud flexibility. For example, should one provider start offering a specific compute service that is not available elsewhere, you can make your data accessible to that provider without incurring the significant access or migration costs you would face when accessing one provider’s storage from another provider’s compute offering.
Additionally, you can securely expose your storage system to your users via your own internet connectivity, without incurring public cloud bandwidth fees.
Later this quarter we will publish a detailed whitepaper with a breakdown of all the costs of both of these solutions, alongside a blueprint of the hardware and software used. Make sure to sign up for our newsletter using the form on the right-hand side of this page (cloud and server category) to be notified when it is released.