For those just starting with Docker, I've written some posts on how to think like containers, which I believe to be an important mindset to using Docker, as well as using Docker in development, the right way.
However, if you want to move one step further on mastering Docker, it's important to understand how volumes and networking work in Docker.
In this post, I'll guide you through the Volume basics, leaving Networking for another post.
Containers are ephemeral
It's important to keep in mind that containers are ephemeral. Whenever the container finishes executing the command, it's completely "shut down".
=> docker run node ls
bin
boot
dev
etc
home
lib
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
usr
var
It starts a new container using the image node and executes the command ls
(list files and directories) inside the container.
The command gives an output and is finished. Once the command is finished, the container is shut down and all its data is lost.
Executing a Javascript program
Let's suppose we have a file containing a very basic Javascript program:
helloWorld.js
greeting = (message) => {
console.log(message)
}
greeting("Hello, world")
How can we run this program using containers? One could try to do something like:
docker run node node helloWorld.js
...which raises the error:
Error: Cannot find module '/app/helloWorld.js'
...
That's because containers are isolated and do not share the same Host filesystem.
Sync data between Host and Container
We have to sync the file helloWorld.js
with the containers. Docker provides a way to mount volumes, which means:
I want to mount a directory or file from the Host to the Container, so every change made in the Host will be mirrored to the Container, and vice-versa.
You change the file/directory in the Host, the container will see the changes. You change the file/directory in the container, the Host will see the changes.
Let's sync the helloWorld.js
with the container:
docker run
-v $(pwd)/docker-101/helloWorld.js:/app/helloWorld.js
node
node /app/helloWorld.js
Hello, world
...where:
- pwd refers to the full pathname of the current directory
- -v {hostPath}:{containerPath} mounts the file/directory to the container
- node is the image
- node /app/helloWorld.js executes the command using the path defined inside the container,
/app/helloWorld.js
A real project
Suppose we have a project tree like this:
/docker-101
/src
/components
components.js
index.js
From inside the docker-101
directory, we want to run a container using the index.js
file, but all the files in the src
folder must be in the container app as well.
docker run
-v $(pwd):/app
node
node /app/index.js
All the files within the pwd
, current directory, will be mounted to the /app
within the container.
What if we want to enter the container and manipulate the files from there?
docker run
-it
-v $(pwd):/app
node
bash
...where:
- -it instructs Docker to keep the container terminal (bash/shell) open
- bash opens a new bash/shell inside the container
root@8dcbfa6d777c:/# cd /app
root@8dcbfa6d777c:/app# ls
index.js src
root@8dcbfa6d777c:/app# touch new-file.js
root@8dcbfa6d777c:/app# ls
index.js new-file.js src
root@8dcbfa6d777c:/app# exit
exit
Now, after exiting the container, if we perform ls
from the host, we get:
index.js new-file.js src
This type of volume is called Path Volume.
Using mount option
Another way to mount volumes is by using the --mount
option, which specifies the type of the mount, a source and a target.
docker run
--mount type=bind,source=$(pwd),target=/app
node
node /app/helloWorld.js
The mount
is more explicit, but in most cases using -v
should be good enough.
Another real example
Imagine we want to print the timestamp for current time. We can do this in a lot of ways, but as for now we're going to use the underscore library.
index.js
var _ = require('underscore');
console.log(_.now())
We execute docker run -it -v $(pwd):/app node node /app/index.js
and we get the following error:
Error: Cannot find module 'underscore'
Obviously, we have to add the underscore
dependency to our project. Let's do it using npm
:
docker run
-v $(pwd):/app
-w /app
node
npm add underscore
added 1 package, and audited 2 packages in 2s
Note the -w /app
, which means working directory, where we are telling Docker to execute the command from within that directory. In our case, the command npm add underscore
will be executed inside the /app
directory in the container.
Now, we can run our index.js
:
docker run
-v $(pwd):/app
-w /app
node
node index.js
1644018496251
Want some NPM cache?
Sometimes we need to speedup the npm process using cache. Inside the container, by default, the npm
stores cache at /root/.npm
. You can check that by running:
npm config get cache
=> /root/.npm
At this moment, we are only syncing the -v $(pwd):/app
volume. When the container writes the npm cache to /root/.npm
, we are not syncing back to the Host.
We learned that we can mount multiple volumes to the container, so let's do that:
docker run
-v $(pwd):/app
-v $(pwd)/npm_cache:/root/.npm <---- sync the NPM cache
-w /app
node
npm install underscore
When we perform ls
from the Host, we get the following:
index.js node_modules npm_cache package-lock.json package.json src
However, we do not want to keep track of "npm_cache" folder in our project. It's not our business.
What if we wanted to store such cache in another path in the Host, therefore we do not care where it's stored?
Docker provides a special type of Volume, where the path in the Host is chosen by Docker, hence the only thing we need to know is its name. Docker knows where it is located, but we do not care.
Yes, they are called Named Volumes.
docker volume create my-volume
docker volume inspect my-volume
[
{
"CreatedAt": "2022-02-05T00:47:59Z",
"Driver": "local",
"Labels": {},
"Mountpoint": "/var/lib/docker/volumes/my-volume/_data",
"Name": "my-volume",
"Options": {},
"Scope": "local"
}
]
The Mountpoint
is the exact path in the Host. So, every container using this volume will sync their data directly to this mountpoint, leaving us to manage and know the volume by using its name elsewhere, "my-volume".
In our example of underscore, we can start the container using named volumes too:
docker run
-v $(pwd):/app
-v npm_cache:/root/.npm <---- named volume in the Host
-w /app
node
npm install underscore
This way, Docker will create a volume called npm_cache
and mount it to the container at /root/.npm
.
Don't believe me? Go check yourself:
docker volume inspect npm_cache
[
{
"CreatedAt": "2022-02-05T00:51:24Z",
"Driver": "local",
"Labels": null,
"Mountpoint": "/var/lib/docker/volumes/npm_cache/_data",
"Name": "npm_cache",
"Options": null,
"Scope": "local"
}
]
Yay! Much clearer, as we keep the speedup of our npm process.
docker run
-v $(pwd):/app
-v npm_cache:/root/.npm
-w /app
node
node index.js
1644022388847
Wrapping up
In this post we learned how to use Path Volumes and Named Volumes in Docker.
Path volumes are very helpful when we want to keep track the path in the host, i.e mounting the entire project we are working on.
Named volumes are handy when we do not want to keep track the path in the host, but instead leveraging Docker to choose the mountpoint
for us, i.e using cache for NPM, Yarn, Ruby bundler, Python packages and so on.