Storage & Volumes
Containers are ephemeral by design—the writable layer vanishes on docker rm. Persistent data needs an explicit mount strategy: managed volumes for production state, bind mounts for dev workflows, tmpfs for secrets in RAM. Pick wrong and you get silent data loss, permission hell, or tenfold I/O slowdown on Docker Desktop.
Storage options
Docker offers four ways to expose storage inside a container. They differ in who manages the data, where it lives on the host, and whether it survives container deletion. Think of them as four parking garages: volumes are a managed lot, bind mounts are your driveway, tmpfs is a valet that forgets everything at shutdown, and named pipes are a direct phone line between processes.
The four mount types
| Type | Managed by | Host location | Survives docker rm | Best for |
|---|---|---|---|---|
| Volume | Docker daemon | /var/lib/docker/volumes/ | Yes (until docker volume rm) | Production databases, shared app data, backups |
| Bind mount | You (explicit host path) | Any host directory or file | Yes (data is on host filesystem) | Dev hot-reload, config injection, host tooling |
| tmpfs | Kernel (RAM) | Not on disk—memory only | No (gone when container stops) | Secrets, temp caches, sensitive scratch data |
| Named pipe / FIFO | OS IPC | Host socket or pipe path | N/A (IPC channel, not storage) | Docker API socket, syslog, sidecar IPC |
┌─── CONTAINER ────────────────────────────────────────────────┐
│ /app/data            ← volume (mydata)   [Docker-managed]    │
│ /src                 ← bind mount ./src  [host path]         │
│ /run/secrets         ← tmpfs             [RAM, no disk]      │
│ /var/run/docker.sock ← bind mount        [named socket/pipe] │
├──────────────────────────────────────────────────────────────┤
│ Writable container layer (upperdir), ephemeral CoW           │
├──────────────────────────────────────────────────────────────┤
│ Read-only image layers (lowerdir)                            │
└──────────────────────────────────────────────────────────────┘
Volumes — Docker-managed persistence
Volumes are the recommended default for persistent application data. Docker creates and tracks them; you reference them by name. They bypass the container's OverlayFS upperdir, so writes do not trigger copy-on-write penalties from image layers. Multiple containers can mount the same volume read-write or read-only.
Bind mounts — host path coupling
Bind mounts map a specific host path into the container. Docker does not manage the data—you do. The container sees the host inode directly. Powerful for development (edit locally, run in container) but couples deployment to host filesystem layout and UID/GID mapping.
tmpfs — memory-backed, ephemeral
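tmpfs mounts store data in memory. Pages can be swapped out if the host has swap enabled, but nothing is written to the container layer or to a volume. That makes tmpfs a good fit for credentials that must not persist and for high-churn temp files you do not want polluting the container layer. Size is bounded by the size= option (for example --tmpfs /run:size=64m) and by available memory. tmpfs mounts are available only for Linux containers.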
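To make the size limit concrete: the --tmpfs flag takes a suffixed size such as size=64m, while the --mount example in the syntax table of this guide writes the same limit as a byte count. The sketch below shows that conversion, assuming binary units (1m = 1024 × 1024 bytes). The helper name to_bytes is hypothetical and not part of the Docker CLI.

```shell
# Hypothetical helper: convert a size like 64m or 1g into bytes.
# Assumes binary units (k = 1024, m = 1024^2, g = 1024^3).
to_bytes() {
  value=$1
  num=${value%[kKmMgG]}          # strip a trailing unit letter, if any
  case $value in
    *[kK]) echo $(( num * 1024 )) ;;
    *[mM]) echo $(( num * 1024 * 1024 )) ;;
    *[gG]) echo $(( num * 1024 * 1024 * 1024 )) ;;
    *)     echo "$num" ;;        # already a plain byte count
  esac
}

to_bytes 64m   # prints 67108864
```

The result matches the tmpfs-size=67108864 value used alongside size=64m in the syntax comparison.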
Named pipes and sockets
Not traditional storage, but a mount type worth knowing: binding a Unix socket or FIFO lets containers talk to host services. The classic example is mounting /var/run/docker.sock so a CI agent can spawn sibling containers—a powerful and dangerous pattern covered in security guides.
# Volume (named, managed)
docker run -d --name db -v pgdata:/var/lib/postgresql/data postgres:16
# Bind mount (host path)
docker run --rm -v "$(pwd)":/app:ro -w /app node:20 npm test
# tmpfs (RAM only)
docker run --rm --tmpfs /run/secrets:rw,noexec,nosuid,size=64m myapp
# Named socket (IPC)
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock docker:cli docker ps
Volume vs bind mount: Volumes are referenced by name, so the same Compose or Swarm definition works on any host; the data itself stays on the node (or remote backend) where the volume driver stores it. Bind mounts tie a container to a specific host path, breaking portability but enabling instant dev feedback loops. Architects standardize on volumes for stateful services; developers use bind mounts locally.
"Where does container data go when the container is removed?" — Only volumes and bind-mounted host paths survive. The writable container layer is deleted. Anonymous volumes survive until pruned, but have no stable name for reattachment.
Volumes deep dive
Volumes are first-class Docker objects with their own lifecycle, drivers, and CLI. Production teams treat them like mini databases—named, backed up, monitored for disk growth, and never left anonymous.
Volume lifecycle commands
| Command | Purpose | Example |
|---|---|---|
| docker volume create | Provision a named volume (optionally with driver/labels) | docker volume create --label env=prod app_logs |
| docker volume ls | List volumes; filter by dangling, driver, label | docker volume ls -f dangling=true |
| docker volume inspect | JSON metadata: mountpoint, driver, labels, scope | docker volume inspect pgdata |
| docker volume rm | Delete one or more volumes (must not be in use) | docker volume rm pgdata_old |
| docker volume prune | Remove unused volumes (destructive; confirm in prod). Since Docker 23.0 only anonymous volumes are removed unless you add -a/--all | docker volume prune -f |
$ docker volume create --driver local --opt type=nfs \
    --opt o=addr=10.0.1.50,rw --opt device=:/exports/pgdata pgdata_nfs
pgdata_nfs
$ docker volume inspect pgdata_nfs --format '{{.Mountpoint}}'
/var/lib/docker/volumes/pgdata_nfs/_data
$ docker run -d --name postgres -v pgdata_nfs:/var/lib/postgresql/data postgres:16
a3f8c2d1e9b0...
$ docker volume ls --filter name=pgdata
DRIVER    VOLUME NAME
local     pgdata_nfs
Volume drivers
The default local driver stores data under Docker's data root. Plugin drivers extend storage to network filesystems and cloud block stores—critical for multi-host Swarm and legacy Docker EE deployments. Kubernetes uses its own CSI drivers instead, but the concepts transfer.
| Driver | Backing | Use case | Notes |
|---|---|---|---|
| local | Host disk (/var/lib/docker/volumes/) | Single-node Docker, Compose dev stacks | Default; fast on Linux; bind subdirs via --opt type=none --opt device=… --opt o=bind |
| NFS | Remote NFS export | Shared storage across Swarm nodes | --opt type=nfs --opt o=addr=… --opt device=:/path |
| Cloud plugins | EBS, EFS, Azure Disk, GCE PD | Cloud-native persistent disks | Install vendor plugin; prefer CSI in Kubernetes |
| Third-party | NetApp, Portworx, RexRay, etc. | Enterprise HA storage | Plugin lifecycle separate from Docker upgrades |
-v vs --mount syntax
Both attach storage, but --mount is explicit and recommended for production scripts. The shorthand -v is easy to misread—host vs container path order trips people up, and it silently creates host directories for bind mounts.
| Scenario | -v shorthand | --mount (preferred) |
|---|---|---|
| Named volume | -v mydata:/app/data | --mount source=mydata,target=/app/data |
| Read-only volume | -v mydata:/app/data:ro | --mount source=mydata,target=/app/data,readonly |
| Bind mount | -v /host/path:/container/path | --mount type=bind,source=/host/path,target=/container/path |
| tmpfs | --tmpfs /run:rw,noexec,size=64m | --mount type=tmpfs,target=/run,tmpfs-size=67108864,tmpfs-mode=1777 |
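The mapping in the table can be sketched as a small translation function. This is only an illustration of how the two syntaxes line up, not how Docker parses them: v_to_mount is a hypothetical helper, it handles only the simple source:target[:ro] form, it assumes paths contain no colons, and it treats a source as a bind mount only when it is an absolute path.

```shell
# Hypothetical helper: translate a simple -v spec into --mount syntax.
# Handles "name:/target", "/host:/target", an optional ":ro",
# and the anonymous form "/target". Assumes no colons inside paths.
v_to_mount() {
  spec=$1
  set -f                        # disable globbing while splitting
  old_ifs=$IFS
  IFS=:
  set -- $spec                  # split on ":" into $1 $2 $3
  IFS=$old_ifs
  set +f
  if [ "$#" -eq 1 ]; then
    # Only a container path: anonymous volume.
    printf 'type=volume,target=%s\n' "$1"
    return 0
  fi
  case $1 in
    /*) kind=bind ;;            # absolute host path
    *)  kind=volume ;;          # anything else is taken as a volume name
  esac
  out="type=$kind,source=$1,target=$2"
  if [ "${3:-}" = ro ]; then
    out="$out,readonly"
  fi
  printf '%s\n' "$out"
}

v_to_mount mydata:/app/data:ro
v_to_mount /host/path:/container/path
v_to_mount /app/node_modules
```

Spelling the type out, as the function does, is what makes --mount harder to misread than the positional -v form.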
Anonymous volumes
When you specify only a container path (-v /var/lib/mysql with no name), Docker creates an anonymous volume with a random ID. The data survives container removal unless you remove the container with docker rm -v or started it with --rm. Left alone, these accumulate as orphaned disk usage. Official database images declare their data directory with a VOLUME instruction in the Dockerfile, so running them without an explicit mount creates an anonymous volume; always mount a named volume at that path in production.
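One way to spot anonymous volumes in a listing is by the shape of their names. The sketch below assumes that anonymous volumes are named with a 64-character lowercase hex ID. That is a heuristic, since nothing stops someone from giving a named volume such a name. The helper only_anonymous is hypothetical.

```shell
# Hypothetical filter: read volume names on stdin and print only those
# that look anonymous (assumption: a 64-character lowercase hex ID).
only_anonymous() {
  grep -E '^[0-9a-f]{64}$' || true   # no match is not an error
}

# Demo with sample names; the hex string stands in for a random ID.
printf '%s\n' \
  pgdata \
  0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \
  app_logs | only_anonymous
```

Against a real daemon the same filter would sit behind a listing, for example docker volume ls -q -f dangling=true | only_anonymous.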
Permissions and ownership
Volume data is written by the container process UID/GID. If the image runs as UID 999 (postgres) but the host directory is owned by root, you get permission denied. Fixes:
- docker run --user to match host ownership
- Init container or entrypoint script that chowns the mount (common in dev)
- COPY --chown=app:app in Dockerfile so the runtime user owns expected paths
- Named volume (when first mounted empty, Docker copies the image's files and ownership at that path into it; database entrypoints also fix permissions on first start)
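Before picking one of these fixes it helps to confirm that ownership is actually the problem. The following sketch compares a host directory's numeric owner with the UID the container process will run as. Both function names are hypothetical, and the UID passed in is whatever the image's user resolves to (for example 999 in the postgres case above).

```shell
# Print the numeric owner UID of a path ("ls -ldn" shows the UID in field 3).
owner_uid() {
  ls -ldn -- "$1" | awk '{ print $3 }'
}

# Compare a directory's owner with the UID the container process will use.
check_mount_owner() {
  dir=$1
  want_uid=$2
  have_uid=$(owner_uid "$dir")
  if [ "$have_uid" = "$want_uid" ]; then
    echo "ok: $dir is owned by UID $want_uid"
  else
    echo "mismatch: $dir is owned by UID $have_uid, container runs as UID $want_uid"
    return 1
  fi
}

# Demo against a throwaway directory owned by the current user.
demo_dir=$(mktemp -d)
check_mount_owner "$demo_dir" "$(id -u)"
rmdir "$demo_dir"
```

A mismatch points at one of the fixes in the list, such as running the container with --user "$(id -u):$(id -g)" in development.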
Backup with tar
The canonical volume backup pattern: mount the volume read-only into a throwaway container and write a tar archive to a bind-mounted backup directory (or stream it to stdout). Restore is the reverse: extract the archive into a fresh volume. Works for any volume driver whose data is accessible locally.
# Backup: stream volume to host tar file
docker run --rm \
-v pgdata:/source:ro \
-v $(pwd):/backup \
alpine tar czf /backup/pgdata-$(date +%Y%m%d).tar.gz -C /source .
# Restore: extract tar into a new volume
docker volume create pgdata_restored
docker run --rm \
-v pgdata_restored:/target \
-v $(pwd):/backup \
alpine sh -c "cd /target && tar xzf /backup/pgdata-20260605.tar.gz"
# Verify restored volume
docker run --rm -v pgdata_restored:/data alpine ls -la /data
Backing up a live database with plain tar can produce inconsistent snapshots. For PostgreSQL use pg_dump; for MySQL use mysqldump or Percona XtraBackup. Tar-on-volume is fine for file uploads, static assets, and stopped containers.
A common platform pattern is a nightly docker run --rm -v "$VOL":/source:ro alpine tar czf - -C /source . piped to object storage such as S3, with retention policies. Pair it with docker volume ls -f dangling=true in CI cleanup jobs; anonymous volumes from integration tests are a common disk-usage leak on shared runners.
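The streaming form of the tar pattern can be tried without Docker. The sketch below runs the same tar invocations against plain temporary directories standing in for the volume and the restore target; roundtrip_demo and the file names are made up for illustration.

```shell
# Backup and restore through a tar stream, using plain directories
# in place of a mounted volume.
roundtrip_demo() {
  src=$(mktemp -d)
  dst=$(mktemp -d)
  archive=$(mktemp)

  printf 'hello\n' > "$src/upload.txt"
  mkdir "$src/nested"
  printf 'world\n' > "$src/nested/asset.txt"

  # Backup: archive the directory contents to stdout, captured in a file.
  tar czf - -C "$src" . > "$archive"

  # Restore: read the archive from stdin into an empty directory.
  tar xzf - -C "$dst" < "$archive"

  # Verify: a recursive diff succeeds only when both trees match.
  status=0
  if diff -r "$src" "$dst" >/dev/null; then
    echo "restore matches source"
  else
    status=1
  fi
  rm -rf "$src" "$dst" "$archive"
  return "$status"
}

roundtrip_demo
```

Inside a container the only difference is that the source directory is the volume's mount point and stdout leaves the container, so the archive can be redirected to a file or piped to an upload tool on the host.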
Declare named volumes under the top-level volumes: key (for example pgdata:) and reference them from the service (services.db.volumes: [pgdata:/var/lib/postgresql/data]); they then survive docker compose down unless you pass -v. Use external: true when a volume is provisioned outside Compose.
Bind mounts
Bind mounts are a direct window into the host filesystem. Edit a file in your IDE, save, and the running container sees the change instantly—no image rebuild. That speed comes with coupling, permission friction, and security exposure if you mount too much.
Dev hot reload pattern
The standard local development loop: bind-mount source code, keep dependencies in the image layer or a named volume. Spring Boot DevTools, Node nodemon, and Vite HMR all depend on the host writing files the container process watches.
# Spring Boot dev: mount pom.xml and source, cache Maven deps in named volume
docker run --rm -p 8080:8080 \
  -v "$(pwd)/pom.xml":/app/pom.xml:ro \
  -v "$(pwd)/src":/app/src \
  -v maven_cache:/root/.m2 \
  -w /app \
  maven:3.9-eclipse-temurin-21 mvn spring-boot:run
# Node dev: mount everything except node_modules
docker run --rm -p 3000:3000 \
-v $(pwd):/app \
-v /app/node_modules \
-w /app node:20 npm run dev
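The anonymous volume trick -v /app/node_modules masks the bind mount for that path, so the container uses its own node_modules instead of the (empty or macOS-incompatible) host directory. With a stock node:20 image that volume starts empty, so install dependencies inside the container first (for example with npm ci) or use an image that already contains /app/node_modules.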
Read-only bind mounts (:ro)
Mount configuration and secrets read-only so a compromised container cannot modify them. Compose: ./config.yml:/app/config.yml:ro. Kubernetes equivalent is readOnly: true on volume mounts. Read-only does not prevent reading: sensitive files still need proper permissions, and runtime secrets belong on tmpfs.
Permissions and UID mapping
| Symptom | Cause | Fix |
|---|---|---|
| Permission denied on write | Container UID ≠ host file owner | Run with --user $(id -u):$(id -g) in dev |
| Root-owned files on host after container exit | Container ran as root, wrote to bind mount | Use non-root USER in Dockerfile; fix with chown |
| SELinux Permission denied (RHEL/Fedora) | Container process lacks label for host path | Add :z (shared) or :Z (private) suffix |
| Docker Desktop Mac/Win slow file watch | FUSE/gRPC filesystem sync overhead | Named volume for node_modules; Mutagen/sync tools |
Risks of bind mounts
- Host path exposure — mounting / or /etc gives the container full host read (or write) access
- Non-portable deploys — production paths differ; bind mounts break across machines
- Accidental deletion — rm -rf inside the container can destroy host data
- Docker creates missing dirs — -v /nonexistent:/data creates /nonexistent as root on the host
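The last risk is easy to guard against in scripts. The sketch below checks that the host path exists before any docker run -v is attempted; require_bind_source is a hypothetical helper, and the paths in the demo are placeholders.

```shell
# Hypothetical preflight: fail instead of letting -v create a missing host path.
require_bind_source() {
  if [ ! -e "$1" ]; then
    echo "refusing to bind-mount missing path: $1" >&2
    return 1
  fi
}

# Demo: an existing path passes, a missing one is rejected.
for candidate in / /definitely/not/here; do
  if require_bind_source "$candidate" 2>/dev/null; then
    echo "ok to mount: $candidate"
  else
    echo "skipped: $candidate"
  fi
done
```

The --mount type=bind form gives the same protection natively, since it reports an error for a missing source instead of creating the directory.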
When to use bind mounts
| Use case | Mount | Mode |
|---|---|---|
| Live code reload (dev) | ./src:/app/src | rw |
| Inject nginx/site config | ./nginx.conf:/etc/nginx/nginx.conf | ro |
| Share TLS certs from host | /etc/letsencrypt:/certs | ro |
| Log shipping to host agent | ./logs:/var/log/app | rw |
| Docker socket passthrough for CI (often mislabeled Docker-in-Docker) | /var/run/docker.sock:/var/run/docker.sock | rw (high risk) |
Never bind-mount /var/run/docker.sock into application containers in production: it grants effective root on the host. Restrict it to dedicated CI builders, and consider alternatives there such as rootless Podman or a sandboxed runtime like gVisor. Treat every bind mount as expanding the container's blast radius.
In Compose, use develop.watch (Compose 2.22+) to sync files without full directory bind mounts—reduces macOS FUSE overhead while keeping hot reload. For pure bind-mount workflows, exclude heavy dirs via anonymous volumes.
Container layer (writable layer)
Every running container gets a thin writable layer on top of read-only image layers. It is convenient scratch space—and a trap for anyone who treats it like persistent storage. OverlayFS copy-on-write makes every first-write to a lower-layer file expensive.
How the writable layer works
Image layers form the lowerdir stack. Docker creates a unique upperdir per container—all runtime writes land here. The merged view is the container root filesystem. When the container is removed, upperdir is deleted. No backup, no recovery.
READ path:  upperdir → lowerN → … → lower0 (first hit wins)
WRITE path: file in lower? → COPY to upperdir (CoW) → write
            new file?      → create directly in upperdir
┌──────── upperdir (container writable layer) ─────────┐
│ /app/logs/app.log  ← created here at runtime         │
│ /app/package.json  ← copied from lower, then edited  │
├──────── lowerdir (image layers, read-only) ──────────┤
│ Layer 3: RUN npm run build                           │
│ Layer 2: COPY package.json                           │
│ Layer 1: FROM node:20-alpine                         │
└──────────────────────────────────────────────────────┘
merged → container sees unified /
Copy-on-write cost
CoW triggers the first time a container modifies a file that exists in a lower layer. OverlayFS copies the entire file to upperdir before applying the change, even for a one-byte edit; later writes to that file go straight to the copy in upperdir. Workloads that touch many or large image files (package managers writing to /usr, in-place edits of bundled data files) pay this cost repeatedly. Reads are fast; first writes are not.
| Operation | Layer impact | Performance |
|---|---|---|
| Read file from image | Lower layer only | Fast (page cache) |
| Create new file | Written to upperdir | Normal disk I/O |
| Modify existing image file | Full file CoW to upperdir | Slow for large files |
| Delete file from image | Whiteout marker in upperdir | Cheap metadata op |
| Heavy random writes (DB) | All in upperdir, through the overlay driver | Poor fit; use volumes instead |
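The "full file CoW" row can be illustrated with a toy simulation. This is not OverlayFS: two plain directories stand in for lowerdir and upperdir, and cp plays the role of the copy-up step. The point is only to show how many bytes land in the writable layer after a one-byte change to a 1 MiB file; cow_demo and the file name are hypothetical.

```shell
# Simulation only: plain directories stand in for lowerdir and upperdir.
cow_demo() {
  root=$(mktemp -d)
  mkdir "$root/lower" "$root/upper"

  # A 1 MiB file that "ships in the image".
  dd if=/dev/zero of="$root/lower/big.bin" bs=1024 count=1024 2>/dev/null

  # First write: copy the whole file up, then append a single byte.
  if [ ! -e "$root/upper/big.bin" ]; then
    cp "$root/lower/big.bin" "$root/upper/big.bin"
  fi
  printf 'x' >> "$root/upper/big.bin"

  # Bytes now held by the simulated writable layer.
  echo $(( $(wc -c < "$root/upper/big.bin") ))
  rm -rf "$root"
}

cow_demo   # prints 1048577
```

One appended byte costs 1,048,577 bytes in the upper directory, which is why large files baked into an image are expensive to modify at runtime.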
Never run databases in the container layer
SQLite, PostgreSQL, MySQL, Redis persistence: all belong on volumes. Database engines mmap files and issue small random writes. On the container layer every one of those writes goes through the overlay storage driver, any data file shipped in the image is copied up in full on first write, and the data is lost on docker rm or whenever the container is recreated. This is not theoretical; it is a common cause of "my database disappeared" incidents.
$ docker run -d --name web nginx:alpine
f4a2b8c1d3e7...
$ docker exec web sh -c 'echo custom > /usr/share/nginx/html/test.html'
$ docker diff web
C /usr
C /usr/share
C /usr/share/nginx
C /usr/share/nginx/html
A /usr/share/nginx/html/test.html
$ docker inspect --format '{{.GraphDriver.Data.UpperDir}}' web
/var/lib/docker/overlay2/a3f8…/diff
Understanding docker diff
docker diff <container> lists changes in the writable layer relative to the image:
- A — Added file or directory
- C — Changed (modified metadata or CoW'd content)
- D — Deleted (whiteout over lower-layer file)
Use it to debug "why is my container using 2 GB of disk?"—often log files, package caches, or database files written to the wrong path. Pair with docker system df -v for per-container writable layer size.
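For long listings a per-type count is a quicker first look than the raw output. The sketch below tallies the A, C, and D markers from text in the docker diff format; summarize_diff is a hypothetical helper, and the sample input is the nginx listing shown in this section.

```shell
# Hypothetical helper: count added, changed, and deleted entries
# from "docker diff"-style lines read on stdin.
summarize_diff() {
  awk '
    $1 == "A" { added++ }
    $1 == "C" { changed++ }
    $1 == "D" { deleted++ }
    END { printf "added=%d changed=%d deleted=%d\n", added, changed, deleted }
  '
}

# Sample input copied from the nginx session in this section.
printf '%s\n' \
  'C /usr' \
  'C /usr/share' \
  'C /usr/share/nginx' \
  'C /usr/share/nginx/html' \
  'A /usr/share/nginx/html/test.html' | summarize_diff
# prints: added=1 changed=4 deleted=0
```

Against a running container the input would come from docker diff directly, for example docker diff web | summarize_diff.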
Writing logs to /var/log inside the container fills the writable layer and the host's Docker storage pool. Configure apps to log to stdout (twelve-factor) or mount a volume/bind for log files. Monitor with docker diff and docker system df.
On overlay2, each container's upperdir lives under /var/lib/docker/overlay2/<id>/diff. The work/ sibling directory is OverlayFS internal state. Volumes mount into the merged view, bypassing upperdir for that path—writes go directly to the volume mount point.
Storage performance
Storage choice directly affects latency, throughput, and developer experience. Linux-native Docker is fast; Docker Desktop on macOS and Windows pays a heavy tax for cross-VM filesystem synchronization. Architects who ignore this ship stacks that crawl locally but fly in production—or vice versa.
OverlayFS read vs write characteristics
Reads from image layers are served from the page cache like normal files—excellent for read-heavy workloads (serving static assets from the image). Writes to the container layer incur CoW overhead. Volume writes skip OverlayFS entirely—they hit the volume driver backing store (local ext4/xfs, NFS, cloud block) with native filesystem performance.
flowchart TB
subgraph reads["Read performance"]
IMG[Image layer read] --> FAST[Fast — page cache]
VOL_R[Volume read] --> FAST2[Fast — native FS]
BIND_R[Bind mount read] --> VAR[Variable — host FS + sync layer]
end
subgraph writes["Write performance"]
UPPER[Container layer write] --> COW[CoW penalty on first write]
VOL_W[Volume write] --> NATIVE[Native FS — best for DB]
BIND_W[Bind mount write] --> DESKTOP[Docker Desktop: FUSE/gRPC tax]
end
Volume vs bind mount on Linux
| Factor | Named volume | Bind mount |
|---|---|---|
| Linux native Docker I/O | Near-native (local driver) | Near-native (direct host path) |
| CoW interaction | Bypasses container upperdir | Bypasses container upperdir |
| Portability | Named, orchestrator-friendly | Host-path dependent |
| Permission setup | Docker-managed mountpoint | Host UID/GID must align |
| Backup ergonomics | Predictable path via inspect | You already know the path |
On bare-metal or VM Linux hosts, volume and bind performance is comparable—both avoid CoW. Choose based on operability, not raw IOPS.
Docker Desktop: macOS and Windows FUSE slowness
Docker Desktop runs containers inside a Linux VM. Bind mounts cross a filesystem synchronization boundary (VirtioFS, gRPC FUSE, or osxfs depending on version). File-heavy operations (npm install, mvn compile, webpack watchers) can be 5–10× slower than on native Linux.
- Store node_modules, .m2, .gradle in named volumes, not bind mounts
- Enable VirtioFS (Docker Desktop settings) for improved macOS bind performance
- Use :delegated or :cached mount consistency flags on older Desktop versions
- Consider devcontainers with source inside the VM volume, synced selectively
Spring Boot dev on Docker Desktop: bind-mount only src/; keep target/ and ~/.m2 in named volumes. Run mvn compile inside the container so bytecode writes stay in the VM. This can cut rebuild time from minutes to seconds on Mac.
Databases always use volumes
Regardless of platform, stateful data stores belong on volumes—not bind mounts, not the container layer:
- Persistence — volume data survives container restart and recreation
- I/O path — no CoW, no cross-VM sync on Desktop when volume lives in the Linux VM
- Backup — stable name for automation (pgdata, redis_data)
- Upgrade path — swap container image, reattach same volume
# Compose: production-shaped storage layout
services:
  app:
    image: myapp:1.0
    volumes:
      - app_logs:/var/log/myapp   # named volume for logs
      # NO bind mount of source in prod
  postgres:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data   # named volume, always
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_pass
    secrets:
      - db_pass
  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    command: redis-server --appendonly yes
volumes:
  pgdata:
  redis_data:
  app_logs:
secrets:
  db_pass:
    file: ./secrets/db_pass.txt   # assumed path; point this at your secret file
| Workload | Recommended storage | Why |
|---|---|---|
| PostgreSQL / MySQL / MongoDB | Named volume | IOPS, persistence, backup |
| Redis AOF/RDB | Named volume | Survive restarts; avoid CoW |
| Local dev source code | Bind mount (+ dep cache volume) | Hot reload; isolate slow dirs |
| Build caches (Maven, npm, cargo) | Named volume | Fast on Desktop; shared across runs |
| Runtime secrets | tmpfs or secrets mount | Never touch disk |
| Static assets in production | Bake into image | Read-only layers = fastest reads |
Dev speed vs prod fidelity: Bind mounts optimize developer iteration; volumes mirror production I/O paths. CI should test with volume-backed databases, not bind mounts or container-layer SQLite, to catch permission and persistence bugs before deploy.
"Why is my Docker setup slow on Mac but fast in production?" — Explain the VM boundary, FUSE/gRPC sync for bind mounts, and the fix: named volumes for I/O-heavy dirs, VirtioFS, minimize cross-boundary file churn. Mention CoW only applies to the container layer, not volumes.