Michael Marsh authored

We're going to be using docker in this course, though not as intensively as git. Still, it's worth taking some time to familiarize ourselves with it, especially since you probably haven't used it before.

What is docker? You can think of it like a lightweight VM. In reality it's considerably different, because it uses the host processor, memory, network stack, etc., without creating virtual hardware. We can throw around terms like layered filesystems, control groups, and network namespaces, but the important part is that you can run a self-contained guest Linux OS within another host Linux OS, with applications and all of their dependencies. The guest can only see the resources given to it by the host, so it provides some (minimal) level of security. It also means we can start a process from a known-clean state, so we have repeatability.

Installation

The first thing we need to do is install docker. If you're running Linux, there's a good chance that your package manager already has docker available (don't confuse it with a KDE package of the same name!), but for the most up-to-date version, you can download it from https://docker.com. One slight complication is if you're running Red Hat Enterprise Linux; Fedora and CentOS are just fine. There's a special version of docker that works with RHEL, but it doesn't work as easily. At this point, you can ignore the rest of this section.

If you're running macOS, then there's a download available from https://docker.com called Docker Desktop. It installs and runs easily. At this point, you can ignore the rest of this section.

If you're running Windows, life becomes more complicated. We're going to restrict ourselves to Docker Desktop under Windows 10 Education (you can get a license from TerpWare, and all it does is enable features already present). Can you run docker with the regular home edition? Yes, but you'll have to run Docker Toolbox, which isn't as well-integrated, so some things won't work properly.

The next thing you need to do is ensure you're running at least Windows 10 version 2004, which supports Windows Subsystem for Linux version 2 (WSL 2).

  • Go to https://aka.ms/wslstore and get a WSL Linux distribution. Ubuntu is a good choice.

  • Install https://wslstorestorage.blob.core.windows.net/wslblob/wsl_update_x64.msi

  • In an Admin PowerShell, run the following:

    • dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart
    • You might need to restart at this point.
    • wsl --set-default-version 2
    • wsl --set-version Ubuntu 2
  • There's a docker service icon at the bottom right (it's a whale) -- right-click on it and select "Settings":

    • Enable WSL 2 as the engine, instead of Hyper-V. This allows docker to take advantage of the Windows/Linux integration in the OS.
    • Expose the TCP daemon on localhost without TLS.
  • For convenience, I suggest doing the following in the Ubuntu shell:

    ln -s "/mnt/c/Users/<your username>" winhome

    That will allow you to access your Windows home directory from Ubuntu as ~/winhome/.

You should now be able to run all docker commands from either PowerShell or the WSL Ubuntu (or other distribution) shell.
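Whichever platform you're on, a quick sanity check is to ask the docker client whether it can reach the docker daemon. Here is a minimal sketch, assuming a POSIX-style shell such as bash or the WSL Ubuntu shell; the function name check_docker is just an illustrative choice.

```shell
# Report whether the docker client can reach the docker daemon.
# "docker version" exits with a non-zero status when it cannot.
check_docker() {
    if docker version >/dev/null 2>&1; then
        echo "docker is up"
    else
        echo "docker is not reachable"
        return 1
    fi
}
```

After defining the function, run check_docker. If it reports that docker is not reachable, make sure Docker Desktop (or the docker service on Linux) has been started.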

Docker Images

Let's start with the concept of an image. This is the self-contained guest Linux OS, which is configured to automatically run some process when it starts. Nothing is running in it -- you can think of it like a hard drive.

The easiest way to get an image is to pull it from a registry. Docker has a default registry built in. We have, at times, used a course VM that is running Ubuntu 16.04 for a common baseline, and it turns out there's an image available with this OS on it! Here's the command to run:

docker pull ubuntu:16.04

Let's go through this command. "docker" is, of course, the utility we're using. The "pull" command tells docker that we want to get something from a registry. In this case, we're getting the "ubuntu" image from the default registry. If we just left it at this, docker would assume the tag "latest" and fetch whatever image currently carries it. Instead, we add ":16.04". That tells docker we want the image with the tag "16.04".
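To make the repository:tag naming concrete, here is a small sketch that splits a name the way we've just described, filling in "latest" when no tag is given. It is only an illustration, not how docker itself parses names: the function name split_image_ref is made up, and it assumes simple names with no registry host or port in front.

```shell
# Split an image name into repository and tag, using "latest"
# when no tag is given.
# Assumption: simple names like "ubuntu:16.04", with no registry host:port.
split_image_ref() {
    ref=$1
    case "$ref" in
        *:*) repo=${ref%%:*}; tag=${ref#*:} ;;
        *)   repo=$ref;       tag=latest ;;
    esac
    echo "repository=$repo tag=$tag"
}

split_image_ref ubuntu:16.04   # prints: repository=ubuntu tag=16.04
split_image_ref my_ubuntu      # prints: repository=my_ubuntu tag=latest
```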

When the command completes, try running

docker images

You should see something like:

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
ubuntu              16.04               2a4cca5ac898        28 hours ago        111MB

Most of this should be fairly self-explanatory. The image ID is another hexadecimal number, like with git, but at 12 digits it's clearly not a full SHA-1 hash. (It's actually the beginning of a SHA-256 digest.) It really doesn't matter what it is, other than a unique identifier for this image.

We can do a few things with this image, aside from running it. Try the following:

docker tag ubuntu:16.04 my_ubuntu
docker images

Note that we now see the same image ID twice, but with different names. By default, a repository (the tagless part of an image name) is tagged as "latest" if you don't specify one. Let's try specifying a tag, though:

docker tag ubuntu:16.04 foo:bar
docker images

The results should not be surprising.

We can quickly build up a lot of images we don't want anymore, so it's good to know how to clean these up. Let's get rid of our new tagged images:

docker rmi my_ubuntu:latest foo:bar
docker images

A common problem is that we'll end up reusing an old tag, leaving an image with no repository:tag name. These show up as "<none>:<none>". We can get rid of all of these with the following bash one-liner:

docker images -a | grep none | awk '{print $3}' | xargs docker rmi

For the curious, feel free to read the man pages for awk and xargs. This is not going to be essential information for this course, though.
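If you'd like to see what each stage of that pipeline does without touching your real images, you can feed it some sample text. The listing below is made up for illustration (the second image ID is not a real one); only the grep and awk stages are exercised.

```shell
# Made-up sample of "docker images -a" output, including one untagged image.
sample='REPOSITORY   TAG      IMAGE ID       CREATED        SIZE
ubuntu       16.04    2a4cca5ac898   28 hours ago   111MB
<none>       <none>   0123456789ab   2 days ago     111MB'

# grep keeps only the lines that mention "none";
# awk then prints the third whitespace-separated field, the image ID.
printf '%s\n' "$sample" | grep none | awk '{print $3}'
# prints: 0123456789ab
```

In the real one-liner, xargs takes those IDs and appends them as arguments to "docker rmi". Note that grep would also match any image whose name happens to contain "none". Newer versions of docker also have "docker image prune", which removes untagged images without the pipeline.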

The commands here are largely from an older version of docker. Now they're aliases to new-style commands. Here's the mapping:

Old Command      New Command
docker images    docker image list
docker pull      docker image pull
docker rmi       docker image rm
docker tag       docker image tag

Running an Image in a Container

Images are all fine and good, but we actually want to use docker to do something, which means we have to run these images. An image runs in a container. The container has system resources allocated to it, and runs a program or programs that exist in the image. A container runs a single image, but an image may be running in multiple containers.

Containers can also be started with various options, such as elevated privileges, mounted volumes, environment variables, and so on. The most basic invocation is

docker run ubuntu:16.04

If you run this, you'll find that it pauses for a second or so, and then returns to the command line. If you want to see running containers, run

docker ps

You see headings, but probably no actual containers. Now, try

docker ps -a

Now we have something! Here's an example of what you might see:

CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS                          PORTS               NAMES
1b937126d5bc        ubuntu:16.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       upbeat_archimedes

Let's parse this out:

  • The container ID is a unique ID, like the image ID we saw before
  • The image should be self-explanatory
  • The command is what the container ran. In this case, it's just bash
  • The created time is when the container was started
  • The status tells us that this container exited, and is no longer running
  • We have no ports bound, but if we did, these would map from local network ports to network ports on the container
  • The names are symbolic names used to refer to this container, and are synonyms for the container ID
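The full listing is wide, so it can help to print only the columns you care about. Here is a sketch using the "--format" option of "docker ps", which takes a Go template; .ID, .Names, and .Status are a few of the available fields. The function name ps_brief is just an illustrative choice.

```shell
# Show only the ID, name, and status of every container, running or not.
ps_brief() {
    docker ps -a --format '{{.ID}}  {{.Names}}  {{.Status}}'
}
```

Calling ps_brief prints one line per container, without the heading row.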

By default, names are assigned randomly according to the pattern <adjective>_<name>, as in "upbeat_archimedes" above.

We can assign a name to the container, which is often useful:

docker run --name=bash_test ubuntu:16.04

This will behave similarly to the previous command, but if we run

docker ps -a

we'll now see our container named "bash_test" along with whatever random name our first container was assigned.

Usually, an image is defined to do something useful when run non-interactively. We can get interactive access to the container, though, as follows:

docker run -ti ubuntu:16.04

We've passed two new options to docker run. The "-t" option allocates a pseudo-TTY, and the "-i" option makes the container interactive by keeping its STDIN open. You should now have a shell on the container running as root! If you run "docker ps" in another terminal, you will see that the container status is "Up" followed by how long the container has been running.

When you're done playing around in this shell, exit to stop the container.

At this point, you probably want to get rid of these stopped containers. Run:

docker rm bash_test
docker ps -a

You'll still have the two randomly-named containers, but the one named "bash_test" should no longer be present. Remove the other two, as well.

We don't have to run the configured program in a container; we can run any command that's present on the image. Let's see this in action:

docker run ubuntu:16.04 /bin/date

That should print the date in the container. It's probably in UTC, while running /bin/date (or the equivalent) on your computer should print the date in your local time zone (EST or EDT for Eastern US time). You can also pass arguments to the command:

docker run ubuntu:16.04 ls /var

Another very useful option is "--rm", which will get rid of the container once it stops:

docker run --rm --name="rm_test" ubuntu:16.04 ls /var

We've once again been using old-style commands, which are aliases:

Old Command    New Command
docker run     docker container run
docker ps      docker container ls

Stopping a Running Container

A container might become unresponsive, or it might be a long-running service that you want to terminate. You can do this with either of the following:

docker kill <container>
docker stop <container>

"stop" is more graceful, trying SIGTERM first, and then SIGKILL. "kill" sends SIGKILL by default, but this can be overridden on the command line.

Old Command    New Command
docker kill    docker container kill
docker stop    docker container stop
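Both commands take an option that changes the default behavior. The sketch below shows one use of each: "-t" sets how many seconds "docker stop" waits after SIGTERM before sending SIGKILL (the default is 10), and "-s" picks the signal that "docker kill" sends. The function names and the 30-second value are arbitrary choices for illustration.

```shell
# Give a container 30 seconds, rather than the default 10, to exit
# after SIGTERM before docker falls back to SIGKILL.
stop_slowly() {
    docker stop -t 30 "$1"
}

# Send SIGHUP instead of the default SIGKILL.
hangup() {
    docker kill -s SIGHUP "$1"
}
```

Each function takes a container name or ID, for example "stop_slowly bash_test". Whether SIGHUP does anything useful depends entirely on the program running in the container.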

Removing Stopped Containers

As with images, you'll tend to accumulate lots of stopped containers, unless you've run them all with the "--rm" option. Fortunately, we can get rid of these with

docker rm <container>

which is now an alias for

docker container rm <container>
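Removing containers one at a time gets tedious, so here is a sketch of a bulk cleanup. It relies on two "docker ps" options: "-q" prints only container IDs, and "-f status=exited" restricts the listing to stopped containers. The function name rm_exited is made up for this example.

```shell
# Remove every container whose status is "exited".
rm_exited() {
    ids=$(docker ps -a -q -f status=exited)
    # Skip the rm call when there is nothing to remove.
    # $ids is deliberately unquoted so each ID becomes its own argument.
    [ -z "$ids" ] || docker rm $ids
}
```

Newer versions of docker also provide "docker container prune", which removes all stopped containers after asking for confirmation.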

Other Options for Running Containers

Here are some useful options you might want to use:

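Option   Argument                       Effect
--rm     (none)                         removes container after exit
-ti      (none)                         run interactively with a pTTY
-e       <name>=<value>                 set environment variables
-h       <hostname>                     set the container's hostname
-p       <host port>:<container port>   map host's <host port> to container's <container port>
-v       <host dir>:<container dir>     mount host's <host dir> on <container dir>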
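These options can be combined freely. Here is a sketch that uses all of them at once; the environment variable, hostname, port numbers, and mount point are placeholders chosen for illustration, and nothing in the ubuntu image actually listens on port 80.

```shell
# Start a throwaway interactive container with an environment variable,
# a hostname, a port mapping, and the current directory mounted at /data.
run_demo() {
    docker run --rm -ti \
        -e GREETING=hello \
        -h demo \
        -p 8080:80 \
        -v "$PWD":/data \
        ubuntu:16.04
}
```

Inside the resulting shell, "echo $GREETING" should print hello, "hostname" should print demo, and "ls /data" should list the files from the directory you started in.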

Executing Commands in a Running Container

Sometimes you need to examine what's going on inside a container. That's where the exec command can come in handy. It's a lot like run but for a container, rather than an image. Here's a common thing you might want to do:

docker run -d --name=svc_instance my_service:latest
docker exec -ti svc_instance /bin/bash

What this does is to first start a container using the latest version of the image my_service, and name the container svc_instance, and then to execute an interactive bash shell on that container. The "-d" option runs the container in the background, so that you get your prompt back for the second command. You don't have to exec an interactive command, though. There may be times when you want to run something like:

docker exec svc_instance touch /var/cache/magic_file

in order to change the behavior of a running process. As with the other commands we've looked at, "docker exec" is now an alias for "docker container exec".

Getting Process Output

Many processes send their output to STDOUT or STDERR. Since there's no TTY available to the process in a container, this output would generally be lost. Docker saves this output for you, however, and you can retrieve it by running either of

docker logs <container>
docker container logs <container>

The first command is now an alias for the second command. There are a number of options, such as "--since" to limit the timeframe of the logs returned, "-f" to continue to follow the logs rather than just dumping their current contents and exiting, and "-t" to show timestamps at the beginnings of lines.
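Here is a sketch that combines those three options to watch a container's recent output. The function name follow_logs and the 10-minute window are arbitrary choices for illustration.

```shell
# Show a container's output from the last 10 minutes with timestamps,
# then keep following new output until interrupted with Ctrl-C.
follow_logs() {
    docker logs --since 10m -t -f "$1"
}
```

For example, "follow_logs svc_instance" would follow the output of the container we named earlier.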