12 Docker Commands Every Data Scientist Should Know
Looking to add Docker to your data science toolbox? Here’s a list of essential Docker commands to help you get started.
Image by Author
Working on a data science project is always exciting. However, it is not without challenges. Each project requires you to install a (possibly) long list of libraries, often at specific versions. So wrapping your head around the project’s dependencies can be quite challenging. Here’s where Docker can help.
Docker is a popular containerization technology. With Docker, you can package your data science application, along with its code and required dependencies, into a portable artifact called an image. Docker thus makes it easier to replicate the development environment and simplifies local development.
Here’s a list of essential Docker commands that’ll come in handy as you’re coding your way through your next project. We’ll work with images from Docker Hub, one of the most popular platforms to find, share, and manage container images.
1. docker pull
To pull an image from Docker Hub, you can run the docker pull command as shown:
docker pull <name-of-the-image>
For example, to pull the Python image from Docker Hub, you can run the following command:
docker pull python
By default, this command pulls the image tagged latest, which is usually (but not necessarily) the most recent release. You can optionally add a tag after the image name, as in python:3.11, to pull a specific version of the image.
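Pinning a tag keeps the environment reproducible, since latest can point to a different release over time. A minimal sketch of pulling a pinned tag follows; pull_tagged is a hypothetical helper name, and the tag in the usage comment is an assumption that you should check against the tags listed on Docker Hub.

```shell
# Pull a specific tagged version of an image instead of the default "latest".
# pull_tagged is a hypothetical helper name used only for this example.
pull_tagged() {
  image="$1"   # repository name on Docker Hub, e.g. python
  tag="$2"     # tag to pin, e.g. 3.11-slim
  docker pull "${image}:${tag}"
}

# Example call (assumes this tag exists on Docker Hub):
# pull_tagged python 3.11-slim
```

Wrapping the command in a function is only a convenience for scripts; typing docker pull followed by the image name, a colon, and the tag does the same thing.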
Note: If you'd like to run the Docker commands as a user without superuser permissions, create the docker group and add the user to that group.
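On a Linux host, the note above might translate into something like the following sketch. It assumes sudo access and the standard groupadd and usermod utilities; add_user_to_docker_group is a hypothetical helper name. Keep in mind that membership in the docker group effectively grants root-level privileges on the host.

```shell
# Sketch for Linux hosts: let a user run docker without typing sudo each time.
# add_user_to_docker_group is a hypothetical helper name.
add_user_to_docker_group() {
  user="${1:-$USER}"   # defaults to the current user
  # Create the docker group; -f makes this succeed if the group already exists.
  sudo groupadd -f docker
  # Append the user to the docker group, keeping their other group memberships.
  sudo usermod -aG docker "$user"
}

# Example call:
# add_user_to_docker_group "$USER"
```

The new group membership typically takes effect only after you log out and log back in.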
2. docker images
To view the list of all the downloaded images, you can run the docker images command.
docker images
3. docker run
After you’ve pulled the image from the registry, you can use the docker run command to spin up a Docker container, a running instance of the image, as shown:
docker run <name-of-the-image>
docker run [options] <name-of-the-image>
For example, you can combine the -i option, which keeps standard input open so the session is interactive, with the -t option, which allocates a pseudo-TTY, to launch an interactive Python REPL when the container starts:
docker run -it python
An image is a portable artifact and a container is a running instance of the image. This means you can run multiple containers from a single Docker image.
Image by Author
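To see several containers running from one image, you could try a sketch like the one below. The helper name start_containers, the demo- name prefix, and the sleep 300 command (used only to keep each container alive for a few minutes) are all assumptions made for illustration.

```shell
# Start several detached containers from the same image.
# start_containers is a hypothetical helper; "demo-N" names are arbitrary.
start_containers() {
  image="$1"   # image to start from, e.g. python
  count="$2"   # how many containers to start
  i=1
  while [ "$i" -le "$count" ]; do
    # -d runs the container in the background; --name gives it a readable name.
    docker run -d --name "demo-${i}" "$image" sleep 300
    i=$((i + 1))
  done
}

# Example call: three containers from the python image
# start_containers python 3
```

Each container gets its own CONTAINER ID even though all of them share the same underlying image.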
4. docker ps
You can run the docker ps command to get a list of all the running containers.
docker ps
Note that there’s a CONTAINER ID associated with each Docker container. Over the next few minutes, we’ll learn Docker commands to stop and restart containers, examine logs, and more. We’ll use the CONTAINER ID of a particular container in those commands.
Suppose you ran a container in one of the previous sessions, and the container is not running anymore. In this case, you can run the docker ps command with the -a option. This will list all the containers: those that are currently running as well as those that were stopped previously.
docker ps -a
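When the list grows long, the --filter and --format options of docker ps can narrow it down. The sketch below shows only containers that have exited and prints a few columns; list_stopped is a hypothetical helper name, and the choice of columns is just one possibility.

```shell
# List only stopped (exited) containers in a compact table.
# list_stopped is a hypothetical helper name.
list_stopped() {
  # --filter keeps containers whose status is "exited";
  # --format selects which columns to print.
  docker ps -a --filter "status=exited" \
    --format "table {{.ID}}\t{{.Image}}\t{{.Status}}\t{{.Names}}"
}

# Example call:
# list_stopped
```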
5. docker stop
You may sometimes need to stop a running container. To do so, run the docker stop command.
docker stop <CONTAINER ID>
6. docker start
You can use the docker start command to restart a previously stopped container. Run the docker ps -a command to find the container ID, then pass it to docker start.
docker start <CONTAINER ID>
7. docker rmi
To remove a specific image, you can run the docker rmi command.
docker rmi <name-of-the-image>
Running this command removes the image from your local development environment. The next time you’d like to start a container from the image, you’ll need to pull the image from Docker Hub again.
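Docker refuses to remove an image that a container still references, so it can help to check first. The following is a rough sketch under that assumption; remove_image_if_unused is a hypothetical helper, and it relies on the ancestor filter of docker ps to find containers created from the image.

```shell
# Remove an image only if no container (running or stopped) was created from it.
# remove_image_if_unused is a hypothetical helper name.
remove_image_if_unused() {
  image="$1"
  # -q prints only container IDs; the ancestor filter matches containers
  # that were created from the given image.
  users=$(docker ps -a -q --filter "ancestor=${image}")
  if [ -n "$users" ]; then
    echo "Image ${image} is still used by containers: ${users}" >&2
    return 1
  fi
  docker rmi "$image"
}

# Example call:
# remove_image_if_unused python
```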
8. docker rm
To remove a container permanently from your development environment, you can run the docker rm command. Docker will not remove a running container unless you pass the -f option, so it's recommended to stop the container before attempting to remove it.
docker rm <CONTAINER ID>
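The stop-then-remove sequence can be sketched as a single step. stop_and_remove is a hypothetical helper name; the && ensures that docker rm runs only if docker stop succeeded.

```shell
# Stop a container and, only if that succeeds, remove it.
# stop_and_remove is a hypothetical helper name.
stop_and_remove() {
  container="$1"   # CONTAINER ID or container name
  docker stop "$container" && docker rm "$container"
}

# Example call (replace the placeholder with an ID from docker ps):
# stop_and_remove <CONTAINER ID>
```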
9. docker logs
The docker logs command fetches the output (standard output and standard error) of a container, which can be especially helpful when debugging.
docker logs <CONTAINER ID>
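For long-running containers the full log can be large, so you may want only the most recent lines. Here is a small sketch using the --tail and --timestamps options of docker logs; recent_logs is a hypothetical helper name and the default of 50 lines is an arbitrary choice.

```shell
# Show the last N log lines of a container, each prefixed with a timestamp.
# recent_logs is a hypothetical helper name; 50 is an arbitrary default.
recent_logs() {
  container="$1"     # CONTAINER ID or container name
  lines="${2:-50}"   # number of lines to show
  docker logs --timestamps --tail "$lines" "$container"
}

# Example call (replace the placeholder with an ID from docker ps):
# recent_logs <CONTAINER ID> 20
```

Adding the -f option to docker logs keeps streaming new output, similar to tail -f.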
10. docker exec
Using the docker exec command, you can run commands inside a running container.
docker exec <CONTAINER ID> <COMMAND> <ARGS>
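As an illustration, the sketch below checks which Python version a running container provides. It assumes the container was started from an image that has python on its PATH (such as the official Python image); python_version_in is a hypothetical helper name.

```shell
# Run a one-off command inside a running container.
# python_version_in is a hypothetical helper name; it assumes the container
# has a python executable on its PATH.
python_version_in() {
  container="$1"   # CONTAINER ID or container name
  docker exec "$container" python --version
}

# Example call (replace the placeholder with an ID from docker ps):
# python_version_in <CONTAINER ID>
```

Adding the -i and -t options to docker exec gives you an interactive session inside the container instead of a single command.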
Try it yourself: As a quick exercise to sum up what you've learned, pull the official Bash image from Docker Hub. Next, try starting an interactive terminal session when spinning up the container, and run a basic Bash command.
11. docker version
To check the version of Docker installed in your working environment, run the docker version command, which reports both the client and the server (engine) versions:
docker version
12. docker info
The docker info command provides more detailed, system-wide information about the Docker installation, such as the number of containers and images and the storage driver in use.
docker info
Output of docker info (truncated)
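Both docker version and docker info accept a --format option with a Go template, which is handy when you need just one or two values. The sketch below is one possible selection of fields; docker_summary is a hypothetical helper name.

```shell
# Print a short summary instead of the full docker version / docker info output.
# docker_summary is a hypothetical helper name.
docker_summary() {
  # Client and server (engine) versions on one line.
  docker version --format 'client={{.Client.Version}} server={{.Server.Version}}'
  # Total number of containers and images known to the Docker daemon.
  docker info --format 'containers={{.Containers}} images={{.Images}}'
}

# Example call:
# docker_summary
```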
Conclusion
I hope you found this tutorial on essential Docker commands helpful. Once you’re familiar with Docker, you can try dockerizing your Python and data science applications. You can then push your application’s image to Docker Hub. Other developers will then be able to pull your image and spin up containers in their working environment, all with a single command.
Bala Priya C is a technical writer who enjoys creating long-form content. Her areas of interest include math, programming, and data science. She shares her learning with the developer community by authoring tutorials, how-to guides, and more.