Docker

Introduction to Docker
Docker hub
- Running existing images
- Manipulate image
Docker recipes

Introduction to Docker

Platform for developing, shipping and running applications
Infrastructure as application / code
First version: 2013
Company: originally dotCloud (2010), later renamed Docker
Established Open Container Initiative

As a software:

Docker Community Edition
Docker Enterprise Edition

Docker components

Images: read-only templates
Containers are run from them
Images are not run
Images have several layers

Images versus containers

Image: a set of layers; an inert, read-only template. An instance of an image is called a container: when you start an image, you get a running container of that image. You can have many running containers of the same image.

The image is the recipe, the container is the cake.

https://stackoverflow.com/questions/23735149/what-is-the-difference-between-a-docker-image-and-a-container

Docker vocabulary

docker

Get help:

docker run --help

Using existing images

Explore Docker hub

Images can be stored locally or shared in a registry.
Docker hub is the main public registry for Docker images.

Let’s search the keyword ubuntu:

docker pull: import image

get latest image / latest release

docker pull ubuntu

choose the version of Ubuntu you are fetching: check the different tags

docker pull ubuntu:18.04

Biocontainers

https://biocontainers.pro/

Specific directory of Bioinformatics related entries

Entries in Docker hub and/or Quay.io (RedHat registry)
Normally created from Bioconda

Example: FastQC

https://biocontainers.pro/#/tools/fastqc

docker pull biocontainers/fastqc:v0.11.9_cv7

docker images: list images

docker images

Each image has a unique IMAGE ID.

docker run: run image, i.e. start a container

Now we want to use what is inside the image.
docker run creates a fresh container (active instance of the image) from a Docker (static) image, and runs it.

The format is:
docker run image:tag command

docker run ubuntu:18.04 /bin/ls

Now execute ls in your current working directory: is the result the same?

You can execute any program/command that is stored inside the image:

docker run ubuntu:18.04 /bin/whoami
docker run ubuntu:18.04 cat /etc/issue

You can either execute programs in the image from the command line (see above) or execute a container interactively, i.e. “enter” the container.

docker run -it ubuntu:18.04 /bin/bash

Run container as daemon (in background)

docker run --detach ubuntu:18.04 tail -f /dev/null

Run container as daemon (in background) with a given name

docker run --detach --name myubuntu ubuntu:18.04 tail -f /dev/null

docker ps: check containers status

List running containers:

docker ps

List all containers (whether they are running or not):

docker ps -a

Each container has a unique ID.

docker exec: execute process in running container

docker exec myubuntu uname -a

Interactively

docker exec -it myubuntu /bin/bash

docker stop, start, restart: actions on container

Stop a running container:

docker stop myubuntu

docker ps -a

Start a stopped container (does NOT create a new one):

docker start myubuntu

docker ps -a

Restart a running container:

docker restart myubuntu

docker ps -a

Run with restart enabled

docker run --restart=unless-stopped --detach --name myubuntu2 ubuntu:18.04 tail -f /dev/null

Restart policies: no (default), always, on-failure, unless-stopped

Update restart policy

docker update --restart unless-stopped myubuntu

docker rm, docker rmi: clean up!

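Remove a stopped container:

docker rm myubuntu

Force the removal of a container, even if it is running:

docker rm -f myubuntu

Remove an image (this fails while a container created from it still exists):

docker rmi ubuntu:18.04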

Major clean

Check used space

docker system df

Remove stopped containers, unused networks, dangling images and build cache - DO WITH CARE

docker system prune

Same as above, but also remove ALL images not used by any container (not only dangling ones) - DO WITH MUCH MORE CARE!!!

docker system prune -a

Reference: https://www.digitalocean.com/community/tutorials/how-to-remove-docker-images-containers-and-volumes

Volumes

Docker containers are isolated from the host filesystem. It is necessary to mount volumes in order to handle input/output files.

Syntax: --volume/-v host:container

mkdir datatest
touch datatest/test
docker run --detach --volume "$(pwd)/datatest":/scratch --name fastqc_container biocontainers/fastqc:v0.11.9_cv7 tail -f /dev/null
docker exec -ti fastqc_container /bin/bash
> ls -l /scratch
> exit

Exercises:
1. Copy the 2 fastq files from the datasets available in the GitHub repository and place them in the mounted directory
2. Run fastqc interactively (inside container): fastqc /scratch/*.gz
3. Run fastqc outside the container

Ports

Ports work like volumes: a service listening on a port inside the container is only reachable from the host once that port is published to a host port.

Syntax: --publish/-p host:container

docker run --detach --name webserver nginx
curl localhost:80
docker exec webserver curl localhost:80
docker rm -f webserver

docker run --detach --name webserver --publish 80:80 nginx
curl localhost:80
docker rm -f webserver

docker run --detach --name webserver -p 8080:80 nginx
curl localhost:80
curl localhost:8080
docker exec webserver curl localhost:80
docker exec webserver curl localhost:8080
docker rm -f webserver

Docker recipes: build your own images

Building recipes

All commands should be saved in a text file, named by default Dockerfile.

Basic instructions

Each instruction in the recipe corresponds to a step of the build. Instructions that change the filesystem (RUN, COPY, ADD) each add a layer to the final image.

FROM: parent image. Typically, an operating system. The base layer.

FROM ubuntu:18.04

RUN: the command to execute inside the image filesystem.
Think about it this way: every RUN line is essentially what you would run to install programs on a freshly installed Ubuntu OS.

RUN apt-get update && apt-get install -y wget

A basic recipe:

FROM ubuntu:18.04

RUN apt update && apt -y upgrade
RUN apt install -y wget

More instructions

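MAINTAINER

Who is maintaining the image?

MAINTAINER Toni Hermoso Pulido <toni.hermoso@crg.eu>

MAINTAINER still works but is deprecated; the recommended equivalent is a LABEL:

LABEL maintainer="Toni Hermoso Pulido <toni.hermoso@crg.eu>"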

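WORKDIR: all subsequent actions will be executed in that working directory. The path is not processed by a shell, so ~ is not expanded to a home directory: give the path explicitly.

WORKDIR /root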

ADD, COPY: add files to the image filesystem

The difference between ADD and COPY:

COPY: lets you copy a local file or directory from your host (the machine from which you are building the image). The source must be located inside the build context.

ADD: same, but ADD works also for URLs, and for .tar archives that will be automatically extracted upon being copied.

# COPY source destination
# The source is relative to the build context; ~ is not expanded
COPY .bashrc .

ENV, ARG: run and build environment variables

The difference between ARG and ENV:

ARG values: available only while the image is built.
ENV values: available while the image is built and in the future running containers.
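
As a rough illustration of the two kinds of variables, the sketch below writes a small recipe from the shell. The directory name argenv_demo, the image name argenvtest and the variable names WGET_OPTS and DATA_DIR are all made up for this example; the Docker commands are left as comments because they need a running Docker daemon.

```shell
# Write a demo recipe into a hypothetical directory
mkdir -p argenv_demo
cat > argenv_demo/Dockerfile <<'EOF'
FROM ubuntu:18.04

# ARG: visible only during the build; can be overridden with --build-arg
ARG WGET_OPTS="--quiet"

# ENV: stored in the image and visible in every container started from it
ENV DATA_DIR=/scratch

# Both variables can be used while building
RUN echo "build option: ${WGET_OPTS}, data dir: ${DATA_DIR}"
EOF

# Then, on a machine with Docker:
# docker build --build-arg WGET_OPTS=--verbose -t argenvtest argenv_demo
# docker run argenvtest printenv DATA_DIR     # should print /scratch
# docker run argenvtest printenv WGET_OPTS    # should print nothing
```

The quoted 'EOF' keeps the shell from expanding ${WGET_OPTS} and ${DATA_DIR} itself, so they reach the Dockerfile unchanged.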

CMD, ENTRYPOINT: command to execute when generated container starts

The ENTRYPOINT specifies a command that will always be executed when the container starts. The CMD specifies the default arguments that will be fed to the ENTRYPOINT.

In the example below, when the container is run without an argument, it will execute echo "hello world".
If it is run with the argument nice it will execute echo "nice"

FROM ubuntu:18.04
ENTRYPOINT ["/bin/echo"]
CMD ["hello world"]
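
The ENTRYPOINT/CMD rule can be imitated in plain shell, without Docker, to make the logic explicit. This is only an analogy: run_container is a hypothetical function standing in for docker run, not a Docker command.

```shell
# Plain-shell imitation of ENTRYPOINT ["/bin/echo"] with CMD ["hello world"]
ENTRYPOINT_CMD=/bin/echo      # always executed
DEFAULT_ARG="hello world"     # used only when no argument is given

# run_container [args...]: hypothetical stand-in for "docker run image [args...]"
run_container() {
    if [ "$#" -eq 0 ]; then
        "$ENTRYPOINT_CMD" "$DEFAULT_ARG"
    else
        "$ENTRYPOINT_CMD" "$@"
    fi
}

run_container         # prints: hello world
run_container nice    # prints: nice
```

With a real image built from the recipe above under a name of your choice, running it without arguments and then with the argument nice should behave the same way.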

A more complex recipe (save it in a text file named Dockerfile):

FROM ubuntu:18.04

MAINTAINER Toni Hermoso Pulido <toni.hermoso@crg.eu>

WORKDIR /root

RUN apt-get update && apt-get -y upgrade
RUN apt-get install -y wget

ENTRYPOINT ["/usr/bin/wget"]
CMD ["https://cdn.wp.nginx.com/wp-content/uploads/2016/07/docker-swarm-hero2.png"]

docker build

Implicitly looks for a file named Dockerfile in the current directory:

docker build .

Same as:

docker build --file Dockerfile .

Syntax: --file / -f

. stands for the context of the build process (in this case, the current directory). The context matters when the recipe copies files from the host filesystem, for instance with COPY. IMPORTANT: Avoid contexts (directories) overpopulated with files, even if they are not actually used in the recipe.
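
When a crowded directory cannot be avoided, a .dockerignore file placed in the context root lists patterns that are left out of the context. The sketch below creates one from the shell; the directory name context_demo and the patterns are only examples to adapt.

```shell
# Create a hypothetical context directory with a .dockerignore file
mkdir -p context_demo
cat > context_demo/.dockerignore <<'EOF'
# Patterns are relative to the root of the context
data/
*.fastq.gz
.git
EOF

# Then, on a machine with Docker (a Dockerfile is expected in context_demo):
# docker build -t mytestimage context_demo
```

Files matching these patterns cannot be used by COPY or ADD in the recipe, so only list what the build really does not need.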

You can define a specific name for the image during the build process.

Syntax: -t imagename:tag. If :tag is not given, it defaults to latest.

docker build -t mytestimage .
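
The default tag rule can be sketched with a small shell function. image_tag is a hypothetical helper written for this page, not part of Docker, and it is deliberately simplified: references that use a digest (name@sha256:...) are not handled.

```shell
# image_tag REF: print the tag of an image reference, or "latest" if it has none
image_tag() {
    # Keep only the part after the last slash, so that a registry port
    # such as localhost:5000/myimage is not mistaken for a tag
    last_component=${1##*/}
    case $last_component in
        *:*) printf '%s\n' "${last_component##*:}" ;;
        *)   printf '%s\n' "latest" ;;
    esac
}

image_tag mytestimage                          # prints: latest
image_tag ubuntu:18.04                         # prints: 18.04
image_tag biocontainers/fastqc:v0.11.9_cv7     # prints: v0.11.9_cv7
```

This is why docker build -t mytestimage . and docker build -t mytestimage:latest . name the same image.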

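With the classic builder, the last line of the build output should be Successfully built …: then you are good to go. (Newer Docker versions that build with BuildKit print a different summary.)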
Check with docker images that you see the newly built image in the list…

Then let’s check the ID of the image and run it!

docker images

docker run f9f41698e2f8
docker run mytestimage

docker run f9f41698e2f8 https://cdn-images-1.medium.com/max/1600/1*_NQN6_YnxS29m8vFzWYlEg.png

docker tag

To tag a local image with ID “e23aaea5dff1” as “ubuntu_wget” with version (tag) “1.0”:

docker tag e23aaea5dff1 ubuntu_wget:1.0

Build cache

Every instruction of a Dockerfile produces an intermediate image/layer by itself, and these intermediate results are cached.

Modify for instance the last bit of the previous image (let’s change the image URL) and rebuild it (even with a different name/tag):

FROM ubuntu:18.04

MAINTAINER Toni Hermoso Pulido <toni.hermoso@crg.eu>

WORKDIR /root

RUN apt-get update && apt-get -y upgrade
RUN apt-get install -y wget

ENTRYPOINT ["/usr/bin/wget"]
CMD ["https://cdn-images-1.medium.com/max/1600/1*_NQN6_YnxS29m8vFzWYlEg.png"]

docker build -t mytestimage2 .

The build reuses the cached layers and restarts from the first modified instruction (here, the last line). This is OK most of the time and very convenient for testing and trying new steps, but it may lead to errors when versions are updated (either the FROM image or the included packages). In that case it is beneficial to start from scratch with the --no-cache option.

docker build --no-cache -t mytestimage2 .
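
One common way to reduce stale-cache problems with packages is to keep apt-get update and apt-get install in the same RUN instruction, so that both are cached, and invalidated, together. A minimal sketch, written from the shell; the directory name cache_demo is hypothetical and the recipe is a shortened variant of the one above.

```shell
# Write a cache-friendlier variant of the recipe into a hypothetical directory
mkdir -p cache_demo
cat > cache_demo/Dockerfile <<'EOF'
FROM ubuntu:18.04

# update and install in ONE instruction: an old cached "apt-get update"
# layer is never reused when the list of installed packages is edited
RUN apt-get update && apt-get install -y wget

ENTRYPOINT ["/usr/bin/wget"]
CMD ["https://cdn-images-1.medium.com/max/1600/1*_NQN6_YnxS29m8vFzWYlEg.png"]
EOF

# Then, on a machine with Docker:
# docker build -t mytestimage2 cache_demo
```

This does not replace --no-cache: a changed FROM image or new package versions upstream still require a rebuild without the cache.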

More advanced image building

Different ways to build images.

Know your base system and its packages. Popular ones:

Debian
CentOS
Alpine
Conda. Anaconda, Conda-forge, Bioconda, etc.

Additional commands

docker inspect: Get details from containers (both running and stopped). Things such as IPs, volumes, etc.
docker logs: Get console messages from containers. Useful with web services.
docker commit: Turn a container into an image. It makes sense when a container has been modified interactively. However, this is bad for reproducibility if the steps are not saved.

Good for long-term reproducibility and for critical production environments:

docker save: Save an image into a tar archive.
docker export: Save a container into a tar archive.
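docker import: Import a tar archive (such as one created by docker export) into an image. An archive created by docker save is restored with docker load.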

Exercises

We explore interactively the different examples in the container/docker folders.

