Integrating the Ubuntu Snapshot Service into systems management and update tools

In an earlier blog post we announced our new snapshot service and described how Microsoft Azure was using it for updates. We recently published information to help people use the service beyond Microsoft Azure.

Today I would like to show a simplified example of how to integrate the snapshot service into systems management tooling. This can let users control the rollout of updates across an Ubuntu fleet and improve the update experience for users. The aim is to inspire those building systems management tools, particularly multi-OS tools, to investigate the Ubuntu snapshot service.

If you are an end user or enterprise wanting to use snapshots, please take a look at Landscape.

This is Canonical’s recommended systems management tool and it has recently added support for the snapshot service.

A video version of this blog post is available here:

Video overview of integrating the Ubuntu snapshot service into systems management tools

Usage of the snapshot service

As a simple example, on an out-of-the-box Ubuntu 24.04 (Noble) system, you can pass a snapshot argument to different apt commands and apt will act as if you had run those commands at that date and time (any time after 1 March 2023):

sudo apt update --snapshot 20240423T230000Z
apt policy hello -S 20240423T230000Z
sudo apt install hello --snapshot 20240423T230000Z

These commands should also work for Ubuntu on our public cloud partners (AWS, Azure, GCP, Oracle or IBM Cloud).

You do need to ensure that the index is up to date before running those other apt commands (note that the apt update with the --snapshot argument above comes before the other commands). A new apt option, --update, makes this a little easier by letting you update your indexes and install in one command:

sudo apt remove hello
sudo apt install hello --update --snapshot 20240423T230000Z

For more detailed guidance on how to use the service, please see the documentation. Snapshots also work on earlier Ubuntu releases with additional configuration. The service has a number of useful applications, including reproducible builds and debugging customer issues.
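A tool that integrates the service will need to produce and check snapshot IDs. The following is a minimal sketch, assuming the IDs always take the YYYYMMDDTHHMMSSZ shape (UTC) seen in the commands above. The function names are hypothetical and are not part of apt.

```shell
# Hypothetical helpers; the names are illustrative and not part of apt.

# Print the current UTC time in the YYYYMMDDTHHMMSSZ shape used above.
snapshot_id_now() {
    date -u +%Y%m%dT%H%M%SZ
}

# Succeed only if the argument has the shape of a snapshot ID.
# This checks the format only, not whether the date is a real calendar date.
is_snapshot_id() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{8}T[0-9]{6}Z$'
}
```

For example, is_snapshot_id 20240423T230000Z succeeds, while is_snapshot_id 2024-04-23 fails. A shape check like this lets a tool reject a mistyped ID before it is passed to apt.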

Integrating the snapshot service into update management

We wrote an earlier blog series on how to balance security and stability in Ubuntu updates. In Part 3 of that series, we talked about different ways to create a point-in-time “snapshot” of updates. This can be a useful technique when updating multiple instances. Ubuntu takes a number of steps to reduce the risk of regressions in security updates. There is always a chance, however, that an update will have a negative consequence in your specific environment.

Using snapshots gives you a consistent set of updates that you can test and move progressively through your production environments. This could limit the “blast radius” of any negative effects of an update, perhaps to one instance in your highly-available set instead of all three. Any progressive rollout of security updates does mean that machines at the end of the rollout are unpatched until it has completed (as covered further below).

Demonstrating a snapshot rollout with example instances

We can demonstrate how a systems management or update tool could integrate the snapshot service by manually running commands. We can do this across four instances to simulate different risk levels, availability or update domains, regions or similar. Each of these demo instances represents an arbitrarily large set of instances in that update domain or risk level.

I will use freshly-launched 24.04 instances. Snapshots work on earlier Ubuntu releases, but require additional steps (detailed in the documentation). We regularly publish new versions of our images on the public clouds that incorporate security fixes. That means an up-to-date image would have few security updates left to install. I will therefore launch from an old build of the image, so that more updates are available.

We can see when the image was created in the build.info:

$ cat /etc/cloud/build.info
build_name: server
serial: 20240423

The “serial” shows us that this image was created on 23 April 2024.
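A management tool could derive a starting snapshot ID from this file rather than hard-coding it. The sketch below assumes the serial begins with an eight-digit date, as in the output above, and that 23:00 UTC on the build date is late enough to cover everything in the image. Both are assumptions, and the function name is hypothetical.

```shell
# Hypothetical helper: print a snapshot ID for 23:00 UTC on the image build date.
# $1 is the path to build.info (defaults to the location shown above).
snapshot_id_from_build_info() {
    local file="${1:-/etc/cloud/build.info}"
    local serial
    # Take the first eight digits after "serial:", ignoring any suffix.
    serial=$(sed -n 's/^serial:[^0-9]*\([0-9]\{8\}\).*/\1/p' "$file" | head -n 1)
    # Fail if the file has no usable serial line.
    [ -n "$serial" ] || return 1
    printf '%sT230000Z\n' "$serial"
}
```

On the demo image above, this would print 20240423T230000Z.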

Preventing uncontrolled updates

By default, we set Ubuntu instances to update themselves once a day with security updates using unattended-upgrades. We can prevent these systems from automatically updating themselves by setting a snapshot that matches the image build date. In this case, we can set all of the systems to use the snapshot of the archive as it was on 23 April 2024 at 23:00 UTC. This means none of the systems should see any updates released after that date and time.

echo 'APT::Snapshot "20240423T230000Z";' | sudo tee /etc/apt/apt.conf.d/50snapshot

Now, any normal apt commands should not install anything new, because the instances see the archive as it was on 23 April 2024.

$ sudo apt update
[...]
All packages are up to date.
$ sudo apt upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

unattended-upgrades also will not install anything because, again, it sees the archive as it was on 23 April:

$ sudo unattended-upgrade -d
[...]
pkgs that look like they should be upgraded:
Fetched 0 B in 0s (0 B/s)
fetch.run() result: 0
Packages blacklist due to conffile prompts: []
No packages found that can be upgraded unattended and no pending auto-removals
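The pinning step in this section could be wrapped in a small function so that a tool can apply it consistently. This is a sketch only: the function name and the optional directory argument are hypothetical, and the file name 50snapshot simply follows the example above.

```shell
# Hypothetical helper: pin apt to a snapshot by writing the APT::Snapshot option.
# $1 is the snapshot ID; $2 is the config directory (defaults to apt's own).
write_snapshot_conf() {
    local id="$1"
    local dir="${2:-/etc/apt/apt.conf.d}"
    # Refuse anything that does not look like YYYYMMDDTHHMMSSZ.
    if ! printf '%s\n' "$id" | grep -Eq '^[0-9]{8}T[0-9]{6}Z$'; then
        echo "invalid snapshot ID: $id" >&2
        return 1
    fi
    # Same content as the echo | tee command shown earlier.
    printf 'APT::Snapshot "%s";\n' "$id" > "$dir/50snapshot"
}
```

Writing to the default directory requires root, so a real tool would run this from a privileged agent or script. Passing a different directory as the second argument makes the function easy to try out without touching the system configuration.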

Creating an update set

Now we want to start rolling updates out across the estate. We start by choosing a snapshot ID, which will likely be the date and time that we start the process. We will use 20240617T120000Z, or 12:00 UTC on 17 June 2024. Our first “update set” is then the updates between the image build date and this 17 June 2024 snapshot.

We are going to start with this first instance, which could be our development or staging environment. First we run the command that we ran before, but with the 17 June snapshot ID:

echo 'APT::Snapshot "20240617T120000Z";' | sudo tee /etc/apt/apt.conf.d/50snapshot

Now we can update our package indexes and the instance does see available updates:

$ sudo apt update
[...]
80 packages can be upgraded. Run 'apt list --upgradable' to see them.

After setting the snapshot, we can use all the normal apt commands to upgrade our packages to that snapshot. Alternatively, we can let unattended-upgrades run, and it will install just the security updates (or whatever else we have configured it to install).

The other instances will not install any of these updates, because we set them to use the old snapshot. That means that we can test the upgraded instance(s) and check that everything seems to be working correctly. If there are issues, we can pause the rollout and address them before rolling the update out more widely.
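Before testing, a tool would probably want to confirm which snapshot each instance is pinned to. A minimal sketch follows, assuming the pin lives in the 50snapshot file written earlier. The function name is hypothetical, and the sketch reads only that one file rather than apt's full effective configuration.

```shell
# Hypothetical helper: print the snapshot ID pinned in a 50snapshot-style file.
# $1 is the file to read (defaults to the path used earlier in this post).
pinned_snapshot() {
    local file="${1:-/etc/apt/apt.conf.d/50snapshot}"
    local id
    [ -r "$file" ] || return 1
    # If the option appears more than once, report the last occurrence.
    id=$(sed -n 's/^[[:space:]]*APT::Snapshot "\([^"]*\)";.*/\1/p' "$file" | tail -n 1)
    # Fail if the file does not set APT::Snapshot at all.
    [ -n "$id" ] || return 1
    printf '%s\n' "$id"
}
```

Collecting this value from every instance gives a simple view of which tier is on which update set.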

Rolling out our update set

Assuming that everything appears fine in our dev/test environment, we can roll the updates to the next tier. We use the process above to set the next group to the same snapshot that we just validated. Perhaps this is internal servers or just instances in one availability zone of highly-available configurations.

Maybe we then let the updates run with production workloads for a day. Then we can move to the next group and repeat the process, again setting the snapshot to the same ID. And so on, through each of the rings or risk levels in our deployment plan.

By this point of the rollout it may be a week later, but we will still use the snapshot from midday 17 June. This is because we validated that update set in testing and in the earlier deployment rings. If we find any issues when rolling out the updates, we can pause the rollout. This gives us time to address the issues (always bearing in mind the security implications of delaying updates).

These rollouts could overlap, too. If the Ubuntu security team releases a new update, we can start a new rollout by creating a new update set based on a snapshot ID that includes that update. Then we can validate that set in testing and early tiers while the previous update set is still rolling out.
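The ring-by-ring process described in this section can be sketched as a simple loop. Everything here is hypothetical: the ring names, and the apply_to_ring and ring_is_healthy placeholders, stand in for whatever a real tool uses to reach its instances and judge their health.

```shell
# Hypothetical ring order, lowest risk first.
RINGS="dev internal prod-az1 prod-all"

# Placeholder: a real tool would push the snapshot ID to every instance in the ring.
apply_to_ring() { echo "ring $1 -> snapshot $2"; }

# Placeholder: a real tool would run its own health checks here.
ring_is_healthy() { return 0; }

# Apply one validated snapshot ID ring by ring, pausing at the first unhealthy ring.
rollout() {
    local id="$1" ring
    for ring in $RINGS; do
        apply_to_ring "$ring" "$id" || return 1
        if ! ring_is_healthy "$ring"; then
            echo "paused after $ring" >&2
            return 1
        fi
    done
}
```

Calling rollout 20240617T120000Z walks the same snapshot ID through every ring. A real rollout would also wait between rings, for example a day, and could run a second update set through the early rings while the first is still reaching the later ones.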

Balancing security and stability

As we mentioned in some of our previous content, you should keep the update rollout as short as possible. Your estate will have unprotected instances for however long you are rolling out security updates. By default, we set Ubuntu server instances to install security updates each day. You need to balance any extension of this against the increased security risk. One approach is to check the severity of any unpatched vulnerabilities, for example in our security feeds. Then users could accelerate rollouts to address high-severity vulnerabilities. If you are creating an interface for end users, exposing this information lets them make more informed choices.

Different organisations need to balance stability and security in different ways. Snapshots are another way that we give Ubuntu users more options in how they strike that balance. By deploying update sets progressively across your risk levels, you can balance security against any potential disruption.

Talk to us!

Please let us know how you are using or integrating the snapshot service in our Public Cloud discourse. We would love to hear from you!

