Basics of Singularity

Overview

Teaching: 15 min
Exercises: 15 min
Questions
Objectives
  • Learn how to download and run images

eRNZ20 attendees only: let’s log in

Assuming you have followed NeSI’s support documentation on setting passwords and configuring your local SSH client, log in to Mahuika, e.g. using a standard terminal configuration: ssh mahuika

Get ready for the hands-on

Before we start, let us ensure we have got the required files to run the tutorials. We’ll create a directory on NeSI’s scratch “nobackup” filesystem for you to work in. NB: these files will be removed within a day of the tutorial.

$ cd ~
$ mkdir /nesi/nobackup/nesi99991/${USER}
$ cd /nesi/nobackup/nesi99991/${USER}

If it does not exist already, download the following GitHub repo. Then cd into it, define a couple of handy variables (see below), and finally cd into demos/02_singularity:

$ git clone https://github.com/nesi/ernz20-containers
$ cd ernz20-containers
$ export ERNZ20=$(pwd)
$ export SIFPATH=$ERNZ20/demos/sif
$ cd demos/02_singularity

We also need to initialise your environment to make Singularity available. NeSI maintains up-to-date versions of Singularity as software modules on Mahuika. Load the latest Singularity module and check the version. Note that the singularity command includes extensive self-documentation:

$ module load Singularity
$ singularity version
$ singularity help

eRNZ20 attendees only: cached images

For the eRNZ20 tutorial we have already downloaded some of the bigger images into a shared directory, /nesi/nobackup/nesi99991/ernz20-containers/demos/sif/. Create the following symbolic link to be able to use them. Downloading the required images yourself would normally take up to an hour.

$ ln -s /nesi/nobackup/nesi99991/ernz20-containers/demos/sif $SIFPATH

One more thing: much of this work will be performed interactively on our Slurm cluster, so we need to request a small allocation:

$ salloc --job-name="SingularityTutorial" --ntasks=4 --time=4:00:00 --account=nesi99991 --reservation=workshop
salloc: Granted job allocation 10179453
salloc: Waiting for resource configuration
salloc: Nodes wbn027 are ready for job

See NeSI’s support docs on Slurm Interactive Sessions for further info.

Regular users of this tutorial: read this instead

Open a second terminal in the machine where you’re running the tutorial, then run the script pull_big_images.sh to start downloading a few images that you’ll require later:

$ export ERNZ20=~/ernz20-containers
$ export SIFPATH=$ERNZ20/demos/sif
$ bash $ERNZ20/demos/pull_big_images.sh

This could take an hour or more. Meanwhile, you’ll be able to keep on going with this episode in your main terminal window.

One more thing: if you’re running this tutorial on a shared HPC system (e.g. on NeSI’s Mahuika cluster), you should use one of the compute nodes rather than the login node. You can get this setup by using an interactive scheduler allocation, for instance on Mahuika with Slurm:

$ salloc --job-name="SingularityTutorial" --ntasks=4 --time=4:00:00
salloc: Granted job allocation 3453895
salloc: Waiting for resource configuration
salloc: Nodes z052 are ready for job

See NeSI’s support docs on Slurm Interactive Sessions for further info.
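
To confirm that your shell is running inside an allocation, one option is to check the SLURM_JOB_ID environment variable, which Slurm sets for jobs. The following is a minimal sketch; the function name in_slurm_allocation is made up for illustration and is not part of Slurm or of the tutorial repository.

```shell
# Hypothetical helper: succeeds if the current shell runs inside a Slurm job.
# It only inspects SLURM_JOB_ID, which Slurm exports inside allocations.
in_slurm_allocation() {
  [ -n "${SLURM_JOB_ID:-}" ]
}

if in_slurm_allocation; then
  echo "Inside Slurm job ${SLURM_JOB_ID} on $(hostname)"
else
  echo "No Slurm allocation detected on $(hostname)"
fi
```

If no allocation is reported, request one with salloc as shown above before continuing.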

Singularity: a container engine for HPC

Singularity is developed and maintained by Sylabs. It was designed from scratch as a container engine for HPC applications, which is clearly reflected in its main features.

This tutorial assumes Singularity version 3.0 or higher. Version 3.3.0 or higher is recommended as it offers a smoother, more bug-free experience.

Container image formats

One of the differences between Docker and Singularity is the adopted format to store container images.

Docker adopts a layered format compliant with the Open Containers Initiative (OCI). Each build command in the recipe file results in the creation of a distinct image layer. These layers are cached during the build process, making them quite useful for development. In fact, repeated build attempts that make use of the same layers will exploit the cache, thus reducing the overall build time. On the other hand, shipping a container image is not straightforward, and requires either relying on a public registry, or compressing the image in a tar archive.

Since version 3.0, Singularity has used the Singularity Image Format (SIF), a single-file layout for container images. One benefit is that an image is simply a single (often very large) file, and thus can be transferred and shipped like any other file. Building on this single-file format, a number of features have been developed, such as image signing and verification, and (more recently) image encryption. A drawback of this approach is that a progressive, incremental build is not possible.

Interestingly, Singularity is able to download and run both types of images.

Note that Singularity versions prior to 3.0 used a slightly different image format, characterised by the extension .simg. You can still find these images on the web; newer Singularity versions are still able to run them.

Executing a simple command in a Singularity container

Running a command is done by means of singularity exec:

$ singularity exec library://library/default/ubuntu:18.04 cat /etc/os-release
INFO:    Downloading library image

NAME="Ubuntu"
VERSION="18.04 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

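Here is what Singularity has just done: it downloaded the Ubuntu image from the Cloud Library, started a container from that image, and executed the command cat /etc/os-release inside it.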

Container images have a name and a tag, in this case ubuntu and 18.04. The tag can be omitted, in which case Singularity will default to a tag named latest.

Using the latest tag

The practice of using the latest tag can be handy for quick typing, but is dangerous when it comes to reproducibility of your workflow, as under the hood the latest image could change over time.
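
As a small safeguard, a script could check that every image reference carries an explicit tag before using it. The sketch below is an illustration only: has_explicit_tag is a hypothetical helper, not a Singularity feature, and it simply looks for a colon in the last component of the reference.

```shell
# Hypothetical helper: succeeds only if the image reference has an explicit
# tag (or digest), i.e. a ':' in its last path component.
has_explicit_tag() (
  ref="${1#*://}"     # drop the library:// or docker:// prefix
  name="${ref##*/}"   # keep the last path component, e.g. ubuntu:18.04
  case "$name" in
    *:*) exit 0 ;;
    *)   exit 1 ;;
  esac
)

for image in library://ubuntu:18.04 library://ubuntu docker://python:3-slim; do
  if has_explicit_tag "$image"; then
    echo "pinned:   $image"
  else
    echo "unpinned: $image (would default to the latest tag)"
  fi
done
```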

The prefix library:// makes Singularity pull the image from the default registry, that is, the Sylabs Cloud Library. Images there are organised by user (library in this case) and, optionally, by user collection (default in the example above). In the particular case of library/default/, this part of the specification can be omitted, for instance:

$ singularity exec library://ubuntu:18.04 echo "Hello World"
Hello World

Here we are also seeing image caching in action: the output no longer mentions the image being downloaded.

Executing a command in a Docker container

Now let’s try to download an Ubuntu container from Docker Hub, i.e. the main registry for Docker containers:

$ singularity exec docker://library/ubuntu:18.04 cat /etc/os-release
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
Getting image source signatures
Copying blob sha256:22e816666fd6516bccd19765947232debc14a5baf2418b2202fd67b3807b6b91
 25.45 MiB / 25.45 MiB [====================================================] 1s
Copying blob sha256:079b6d2a1e53c648abc48222c63809de745146c2ee8322a1b9e93703318290d6
 34.54 KiB / 34.54 KiB [====================================================] 0s
Copying blob sha256:11048ebae90883c19c9b20f003d5dd2f5bbf5b48556dabf06c8ea5c871c8debe
 849 B / 849 B [============================================================] 0s
Copying blob sha256:c58094023a2e61ef9388e283026c5d6a4b6ff6d10d4f626e866d38f061e79bb9
 162 B / 162 B [============================================================] 0s
Copying config sha256:6cd71496ca4e0cb2f834ca21c9b2110b258e9cdf09be47b54172ebbcf8232d3d
 2.42 KiB / 2.42 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
INFO:    Creating SIF file...
INFO:    Build complete: /data/singularity/.singularity/cache/oci-tmp/a7b8b7b33e44b123d7f997bd4d3d0a59fafc63e203d17efedf09ff3f6f516152/ubuntu_18.04.sif

NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Rather than just downloading a SIF file, Singularity now has more work to do, as it has to both download the image layers (blobs) from Docker Hub and assemble them into a single SIF file.

Note that, to point Singularity to Docker Hub, the prefix docker:// is required.

Also note how Docker Hub organises images only by users (also called repositories), not by collections.

What is the latest Ubuntu image from the Sylabs Cloud?

Write down a Singularity command that prints the OS version through the latest Ubuntu image from Sylabs Cloud Library.

Solution

$ singularity exec library://ubuntu cat /etc/os-release
[..]
NAME="Ubuntu"
VERSION="18.10 (Cosmic Cuttlefish)"
[..]

It’s version 18.10.

Open up an interactive shell

Sometimes it can be useful to open a shell inside a container, rather than to execute commands, e.g. to inspect its contents.

Achieve this by using singularity shell:

$ singularity shell library://ubuntu:18.04
Singularity ubuntu_18.04.sif:/home/ubuntu/ernz20-containers/demos/02_singularity>

Remember to type exit, or hit Ctrl-D, when you’re done!

Download and use images via SIF file names

All examples so far have identified container images using their registry name specification, e.g. library/default/ubuntu:18.04 or similar.

An alternative option to handle images is to download them to a known location, and then refer to their SIF file names.

Let’s use singularity pull to download the image as a SIF file (output might differ depending on the Singularity version you use):

$ singularity pull library://ubuntu:18.04
WARNING: Container might not be trusted; run 'singularity verify ubuntu_18.04.sif' to show who signed it
INFO:    Download complete: ubuntu_18.04.sif

By default, the image is saved in the current directory:

$ ls
ubuntu_18.04.sif
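
The file names seen in this episode follow a name_tag.sif pattern (ubuntu_18.04.sif here, ubuntu_latest.sif when the tag is omitted). When scripting, it can help to predict that name from the image reference. The sketch below assumes this pattern holds for your Singularity version; default_sif_name is a hypothetical helper, not a Singularity command.

```shell
# Hypothetical helper: predict the default file name written by
# `singularity pull`, assuming the <name>_<tag>.sif pattern seen above.
default_sif_name() (
  ref="${1#*://}"     # drop the library:// or docker:// prefix
  name="${ref##*/}"   # keep the last path component, e.g. ubuntu:18.04
  case "$name" in
    *:*) tag="${name##*:}"; name="${name%%:*}" ;;
    *)   tag="latest" ;;   # omitted tag defaults to latest
  esac
  printf '%s_%s.sif\n' "$name" "$tag"
)

default_sif_name library://ubuntu:18.04    # ubuntu_18.04.sif
default_sif_name docker://python:3-slim    # python_3-slim.sif
default_sif_name library://ubuntu          # ubuntu_latest.sif
```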

Also note the trust warning from when we pulled the container image - do not ignore these! Singularity has features and specific services for signing and verifying the source and integrity of containers. Let’s check this image:

$ singularity verify ubuntu_18.04.sif
Container is signed by 1 key(s):

Verifying partition: FS:
8883491F4268F173C6E5DC49EDECE4F3F38D871E
[REMOTE]  Sylabs Admin <support@sylabs.io>
[OK]      Data integrity verified

INFO:    Container verified: ubuntu_18.04.sif

This tells us that the container was signed by Sylabs Admin and that the integrity of the image is ok, i.e., it hasn’t been changed since it was built and signed.
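
In scripts, the check can be made a precondition for running anything from the image. The following is a minimal sketch, assuming that singularity verify exits with a non-zero status when verification fails; run_verified is a hypothetical wrapper introduced here for illustration.

```shell
# Hypothetical wrapper: run a command in an image only if its signature
# verifies. Assumes `singularity verify` returns non-zero on failure.
run_verified() {
  image="$1"
  shift
  if singularity verify "$image"; then
    singularity exec "$image" "$@"
  else
    echo "Refusing to run unverified image: $image" >&2
    return 1
  fi
}

# Example usage (requires Singularity and a previously pulled image):
# run_verified ./ubuntu_18.04.sif echo "Hello World"
```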

Now that we’ve checked the image, we can use the image file directly:

$ singularity exec ./ubuntu_18.04.sif echo "Hello World"
Hello World

You can specify the storage location with:

$ mkdir -p sif_lib
$ singularity pull --dir sif_lib docker://library/ubuntu:18.04
INFO:    Using cached image
$ ls sif_lib
ubuntu_18.04.sif

Organise your local container images

Being able to specify download locations for the container images allows you to keep your local set of images organised and tidy, by making use of a directory tree. It also allows for easy sharing of images within your team in a shared resource.

Configure cache and pull directory locations

Lots of Singularity settings can be configured by means of environment variables.

The default directory location for the image cache is $HOME/.singularity/cache. This location can be inconvenient in shared resources such as HPC centres, where often the disk quota for the home directory is limited. You can redefine the path to the cache dir by setting the variable SINGULARITY_CACHEDIR.

Similarly, if you have a preferred location to pull images into, you can avoid using the flag --dir at runtime and instead define the variable SINGULARITY_PULLFOLDER.
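
For instance, both variables could be set in your shell startup file or at the top of a job script. This is a sketch under assumptions: SCRATCH_BASE is a hypothetical variable standing in for a directory on a filesystem with a generous quota, and it falls back to a temporary directory only so that the snippet runs anywhere.

```shell
# Hypothetical base directory; point it at a filesystem with enough quota.
SCRATCH_BASE="${SCRATCH_BASE:-${TMPDIR:-/tmp}/${USER:-user}}"

# Redirect the image cache and the default pull destination.
export SINGULARITY_CACHEDIR="$SCRATCH_BASE/singularity_cache"
export SINGULARITY_PULLFOLDER="$SCRATCH_BASE/sif_lib"
mkdir -p "$SINGULARITY_CACHEDIR" "$SINGULARITY_PULLFOLDER"

echo "cache dir:   $SINGULARITY_CACHEDIR"
echo "pull folder: $SINGULARITY_PULLFOLDER"
```

With these set, a plain singularity pull should place the SIF file in the pull folder without needing --dir.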

Reclaim cache space

If you are running out of disk space, you can inspect the cache with this command (add -v from Singularity version 3.4 on):

$ singularity cache list
NAME                     DATE CREATED           SIZE             TYPE
ubuntu_latest.sif        2019-10-21 13:19:50    28.11 MB         library
ubuntu_18.04.sif         2019-10-21 13:19:04    37.10 MB         library
ubuntu_18.04.sif         2019-10-21 13:19:40    25.89 MB         oci

There 3 containers using: 91.10 MB, 6 oci blob file(s) using 26.73 MB of space.
Total space used: 117.83 MB

and then clean it up, e.g. to wipe everything use the -a flag (use -f instead from Singularity version 3.4 on):

$ singularity cache clean -a
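
Independently of Singularity, standard tools can report how much space the cache directory occupies. The sketch below resolves the directory from SINGULARITY_CACHEDIR, falling back to the default mentioned earlier; cache_dir is a hypothetical helper, and the exact layout beneath that directory may differ between Singularity versions.

```shell
# Hypothetical helper: print the cache location, honouring
# SINGULARITY_CACHEDIR and otherwise using the documented default.
cache_dir() {
  echo "${SINGULARITY_CACHEDIR:-$HOME/.singularity/cache}"
}

# Report disk usage only if the directory exists.
if [ -d "$(cache_dir)" ]; then
  du -sh "$(cache_dir)"
else
  echo "No cache directory at $(cache_dir)"
fi
```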

Contextual help on Singularity commands

Use singularity help, optionally followed by a command name, to print help information on features and options.

At the time of writing, Docker Hub hosts a much wider selection of container images than Sylabs Cloud. This includes Linux distributions, Python and R deployments, as well as a big variety of applications.

Bioinformaticians should keep in mind another container registry, Quay by Red Hat, which hosts thousands of applications in this domain of science. These mostly come out of the Biocontainers project, which aims to provide automated container builds of all of the packages made available through Bioconda.

Nvidia maintains the Nvidia GPU Cloud (NGC), hosting an increasing number of containerised applications optimised to run on GPUs.
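
Images on registries such as Quay and NGC can generally be referenced with the same docker:// prefix, by putting the registry host name (quay.io or nvcr.io) in front of the repository path. The sketch below only assembles and prints such URIs. The helper registry_uri is hypothetical, the repository names are illustrative, and <tag> is a placeholder to replace with a tag actually listed on the registry; some registries may also require authentication.

```shell
# Hypothetical helper: build a docker:// URI for an image hosted on a
# registry other than Docker Hub. Arguments: host, repository, tag.
registry_uri() {
  printf 'docker://%s/%s:%s\n' "$1" "$2" "$3"
}

# Illustrative repositories; replace <tag> with a real tag before pulling.
registry_uri quay.io biocontainers/blast "<tag>"
registry_uri nvcr.io nvidia/pytorch "<tag>"
```

The printed URI can then be passed to singularity pull or singularity exec in the same way as the Docker Hub examples above.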

Pull and run a Python container

How would you pull the following container image from Docker Hub, python:3-slim?

Once you’ve pulled it, check the Python version inside the container by running python --version.

Solution

Pull:

$ singularity pull docker://python:3-slim

Get Python version:

$ singularity exec ./python_3-slim.sif python --version
Python 3.8.0

Key Points

  • Singularity can run both Singularity and Docker container images

  • Execute commands in containers with singularity exec

  • Open a shell in a container with singularity shell

  • Download a container image in a selected location with singularity pull