Volumes in Docker: The Right Way to Handle Persistent Data

by Arif Ikhsanudin, Backend Developer

The database that lost everything

Your team runs a PostgreSQL container for local development. Someone runs docker rm -f postgres-container to fix a stuck container. The database and its data are gone. Not in a volume, not backed up, just gone — because the data lived inside the container filesystem.

This is the most fundamental thing to understand about Docker containers: the container filesystem is ephemeral. Everything written to the container's layered filesystem during runtime is discarded when the container is removed. This is a feature, not a bug — it's what makes containers reproducible. But it means data that needs to persist must be stored outside the container filesystem.

Docker provides two mechanisms for this: volumes and bind mounts.

Named volumes: Docker-managed persistence

A named volume is a storage area managed by Docker, stored on the host filesystem at a location Docker controls (/var/lib/docker/volumes/ on Linux). You refer to it by name rather than by host path.

# Create a named volume
docker volume create pg_data

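# Run a container using the volume
# (the official image will not initialize an empty data directory
# without POSTGRES_PASSWORD; change-me is a placeholder)
docker run -d \
  --name postgres \
  -e POSTGRES_PASSWORD=change-me \
  -v pg_data:/var/lib/postgresql/data \
  postgres:16-alpine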

The -v pg_data:/var/lib/postgresql/data syntax means: mount the pg_data volume at /var/lib/postgresql/data inside the container. PostgreSQL writes its data files there. When you remove and recreate the container, the volume persists and the data is intact.
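
One way to convince yourself of this is to write a row, destroy the container, and read the row back from a fresh one. The sketch below wraps those steps in shell functions. The function names, the volume_check table, and the change-me password are illustrative assumptions, not part of the setup above.

```shell
# Hypothetical helpers; names and the password value are placeholders.
PG_CONTAINER="postgres"
PG_VOLUME="pg_data"

start_postgres() {
  # Same run command as above, including the password the official
  # image requires before it initializes an empty data directory.
  docker run -d \
    --name "$PG_CONTAINER" \
    -e POSTGRES_PASSWORD=change-me \
    -v "$PG_VOLUME":/var/lib/postgresql/data \
    postgres:16-alpine
}

wait_for_postgres() {
  # pg_isready ships with the image. Checking over TCP (127.0.0.1) skips
  # the temporary socket-only server the image runs during first init.
  until docker exec "$PG_CONTAINER" pg_isready -h 127.0.0.1 -U postgres >/dev/null 2>&1; do
    sleep 1
  done
}

run_sql() {
  # -At prints bare values without headers or column alignment.
  docker exec "$PG_CONTAINER" psql -U postgres -At -c "$1"
}

check_persistence() {
  start_postgres && wait_for_postgres || return 1
  run_sql "CREATE TABLE IF NOT EXISTS volume_check (note text)" || return 1
  run_sql "INSERT INTO volume_check VALUES ('still here')" || return 1

  docker rm -f "$PG_CONTAINER" || return 1   # container gone, volume untouched

  start_postgres && wait_for_postgres || return 1
  run_sql "SELECT note FROM volume_check"
}
```

Calling check_persistence should end by printing the inserted note from the second container. Running the same steps without the -v line is a quick way to see the opposite behavior: the second container starts with an empty database and the SELECT fails.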

Named volumes in Docker Compose:

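services:
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: change-me   # placeholder; required to initialize an empty data directory
    volumes:
      - pg_data:/var/lib/postgresql/data

volumes:
  pg_data:    # declares the volume at the Compose project level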

The volumes: key at the service level mounts the volume. The volumes: key at the top level declares it as a Compose-managed volume. If the volume doesn't exist when docker compose up runs, Compose creates it, prefixing the name with the project name (for example, myproject_pg_data). If it exists, Compose uses the existing one, so data is preserved between docker compose down and docker compose up.

docker compose down does NOT delete volumes. To delete volumes with containers: docker compose down -v. This is intentional. You have to explicitly opt into data deletion.

When Docker initializes volumes

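Postgres-specific behavior that surprises people: when you mount a named volume at /var/lib/postgresql/data, the image's entrypoint script checks on startup whether the directory already holds a database. If the directory is empty, it initializes a new database and applies POSTGRES_DB, POSTGRES_USER, and POSTGRES_PASSWORD. If it contains existing PostgreSQL data files, it starts the server on them and ignores those variables, so changing the password in your Compose file has no effect on an existing volume.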

This means the volume correctly persists the database across container restarts and removals. It also means: if you change the Postgres version in your Compose file (e.g., from 15 to 16), the existing volume has PostgreSQL 15 data files. PostgreSQL 16 will refuse to start on them, because the on-disk format is not compatible across major versions. You'll need to either start over with a fresh volume or migrate the data, typically with a dump and restore or with pg_upgrade.
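
A dump and restore could look roughly like the following. This is a sketch under assumptions: the old container is still running and named postgres, the superuser is postgres, and the names postgres16, pg_data_16, and pg_dumpall.sql are made up for illustration.

```shell
# Hypothetical names; adjust containers, volume, and file to your setup.
OLD_CONTAINER="postgres"        # still running the old major version
NEW_CONTAINER="postgres16"
NEW_VOLUME="pg_data_16"         # fresh, empty volume for the new version
DUMP_FILE="pg_dumpall.sql"

dump_old_cluster() {
  # pg_dumpall writes every database plus roles as plain SQL to stdout.
  docker exec "$OLD_CONTAINER" pg_dumpall -U postgres > "$DUMP_FILE"
}

start_new_cluster() {
  docker volume create "$NEW_VOLUME" || return 1
  docker run -d \
    --name "$NEW_CONTAINER" \
    -e POSTGRES_PASSWORD=change-me \
    -v "$NEW_VOLUME":/var/lib/postgresql/data \
    postgres:16-alpine
}

restore_into_new_cluster() {
  # -i keeps stdin open so psql inside the container can read the host file.
  docker exec -i "$NEW_CONTAINER" psql -U postgres < "$DUMP_FILE"
}
```

Run the three functions in order, waiting for the new server to accept connections before the restore. The restore usually reports that the postgres role already exists, which is harmless. Keeping the old volume around until the new database has been checked leaves a way back.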

Bind mounts: host path to container path

A bind mount mounts a specific host filesystem path into the container:

docker run -v /home/user/myapp/config:/etc/myapp/config:ro my-app

/home/user/myapp/config on the host is mounted at /etc/myapp/config inside the container. The :ro suffix makes it read-only inside the container. Changes made on the host are immediately visible inside the container. Without :ro, writes from the container land on the host as well.

In Compose:

services:
  app:
    volumes:
      - ./config:/etc/app/config:ro
      - ./src:/app/src     # for live reload in development

The path on the left of : is relative to the docker-compose.yml file's directory.

Bind mounts are the development tool — they let you edit code on the host and see changes reflected in the running container without rebuilding. They're not the right choice for database storage because:

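  • You have to manage the host path yourself. With -v, Docker silently creates a missing host path as an empty, root-owned directory, so a typo goes unnoticed (--mount fails instead); named volumes are created automatically
  • Permissions on the host filesystem may not match the container user
  • On Docker Desktop (Mac/Windows), bind mount performance can be noticeably slower than named volumes because every file access crosses the VM boundary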
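
For the permissions point, one common workaround on Linux is to run the container process with the host user's UID and GID, so files it writes into the bind mount are owned by you rather than by root. The helper below is a hypothetical sketch; the function name and argument order are assumptions, and it only suits images that tolerate running as an arbitrary user.

```shell
run_as_host_user() {
  # Usage: run_as_host_user <absolute-host-dir> <container-dir> <image> [command...]
  local host_dir="$1"
  local container_dir="$2"
  local image="$3"
  shift 3
  # --user overrides the image's default user for this container only.
  docker run --rm \
    --user "$(id -u):$(id -g)" \
    -v "$host_dir":"$container_dir" \
    "$image" "$@"
}
```

For example, run_as_host_user "$(pwd)/out" /work alpine touch /work/report.txt leaves a file in ./out owned by the current user. The host directory has to be an absolute path; a bare name on the left side of -v is interpreted as a named volume instead.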

Volume mount behavior: what gets overwritten

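When you mount a volume or bind mount at a path that already exists in the container image, the mount obscures the image content at that path. The image's original files at that path are not visible; what you see is the content of whatever was mounted. There is one exception: a new, empty named volume is first populated with the image's files at that path, so those files show up inside the volume. Bind mounts and volumes that already contain data get no such copy.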

This is usually what you want for data directories. It can be surprising when:

COPY config/ /app/config/

And then:

volumes:
  - ./local-config:/app/config

The local bind mount completely hides the /app/config/ that was copied in during the Docker build. The image's config files are invisible during this mount. If the local ./local-config directory is empty or missing files, those files don't exist in the running container.

For node_modules, this creates the classic issue:

volumes:
  - .:/app            # mounts entire project, including host's node_modules

If the host doesn't have node_modules (or has a different version), the container's npm install-produced node_modules is hidden. The fix:

volumes:
  - .:/app
  - node_modules:/app/node_modules   # named volume takes precedence for this path

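Docker applies nested mounts so that the more specific path wins. The named volume at /app/node_modules is more specific than the bind mount at /app, so the named volume is used for that path. Because a new, empty named volume is populated from the image the first time it is mounted, it starts out holding the node_modules produced by npm install during the build. That copy happens only once: after changing dependencies, rebuild the image and remove the volume so it is repopulated. The node_modules volume also has to be declared under the top-level volumes: key.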
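
When it is unclear which mount is serving a path, docker inspect can list what is actually mounted in a container. The helper name list_mounts is made up for this sketch; the .Mounts fields it prints (Type, Source, Destination) come from the standard docker inspect output.

```shell
list_mounts() {
  # Usage: list_mounts <container-name-or-id>
  # Prints one line per mount: type, source on the host, destination in the container.
  docker inspect \
    --format '{{range .Mounts}}{{println .Type .Source .Destination}}{{end}}' \
    "$1"
}
```

For the setup above, the listing should contain a bind entry for /app and a volume entry for /app/node_modules. If only the bind entry appears, the named volume is not being mounted and the host directory is what the container sees.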

Volume drivers: when the default isn't enough

The default volume driver (local) stores data on the host's local disk. This is fine for single-host setups but doesn't work for distributed environments where multiple hosts need access to the same volume.

For Kubernetes, volumes are managed by the orchestrator (PersistentVolume, PersistentVolumeClaim). For Docker Swarm, you can back volumes with network storage, either through the local driver's NFS options or through volume plugins such as rclone or cloud-provider plugins (EBS, Azure Disk, GCE Persistent Disk).

For a single-host production Compose setup, the local driver with named volumes is appropriate. For multi-host, don't use Docker volumes for shared data — use external storage (S3, RDS, managed Redis) and point your application at it.

Managing volumes

# List all volumes
docker volume ls

# Inspect a volume (shows mount point on host)
docker volume inspect pg_data

# Remove a specific volume (fails if in use by a container)
docker volume rm pg_data

# Remove unused volumes (BE CAREFUL)
# Docker 23.0+ prunes only anonymous volumes unless you add --all
docker volume prune

# Backup a volume by mounting it and copying
# (stop the container using it first so the files are consistent)
docker run --rm \
  -v pg_data:/data \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/pg_data_backup.tar.gz -C /data .

# Restore (into an empty volume, with no container using it)
docker run --rm \
  -v pg_data:/data \
  -v "$(pwd)":/backup \
  alpine tar xzf /backup/pg_data_backup.tar.gz -C /data
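
The backup command can be wrapped so that it stops the writer, archives the volume read-only, and restarts the container. This is a sketch, not a complete backup strategy: the function name, argument order, and timestamp format are assumptions, and for a live database a logical dump is often the safer choice.

```shell
backup_volume() {
  # Usage: backup_volume <volume> <container-using-it> <absolute-backup-dir>
  local volume="$1"
  local container="$2"
  local backup_dir="$3"
  local archive
  local rc
  archive="${volume}_$(date +%Y%m%d_%H%M%S).tar.gz"

  mkdir -p "$backup_dir" || return 1
  # Stop the writer first so files are not changing mid-archive.
  docker stop "$container" || return 1
  # Mount the volume read-only; only the backup directory is writable.
  docker run --rm \
    -v "$volume":/data:ro \
    -v "$backup_dir":/backup \
    alpine tar czf "/backup/$archive" -C /data .
  rc=$?
  # Restart the container whether or not the archive succeeded.
  docker start "$container"
  return "$rc"
}
```

A call such as backup_volume pg_data postgres "$(pwd)/backups" writes a timestamped archive into ./backups. The backup directory must be an absolute path, because a bare name on the left side of -v would be treated as a volume name.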

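docker volume prune removes volumes that no container references, whether running or stopped. Since Docker Engine 23.0 it removes only anonymous volumes by default; add --all (-a) to include unused named volumes as well. Run it periodically to reclaim disk space, but first check that every volume you want to keep is attached to an existing container or has been backed up.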
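
A cautious habit is to list the candidates before deleting anything. The sketch below uses the dangling=true filter of docker volume ls; the function names are made up for illustration.

```shell
list_unused_volumes() {
  # dangling=true matches volumes that no container references.
  docker volume ls --quiet --filter dangling=true
}

prune_with_preview() {
  local unused
  unused="$(list_unused_volumes)"
  if [ -z "$unused" ]; then
    echo "No unused volumes."
    return 0
  fi
  echo "Unused volumes:"
  echo "$unused"
  # Without --force, Docker asks for confirmation before deleting.
  docker volume prune
}
```

On Docker 23.0 and later the preview can list more than the prune removes: the filter shows named and anonymous volumes alike, while plain docker volume prune touches only the anonymous ones.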

The pattern for production single-host

services:
  db:
    image: postgres:16-alpine
    volumes:
      - pg_data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    restart: unless-stopped

volumes:
  pg_data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/postgres    # specific host path for backup tooling

Specifying the host path via driver_opts gives you a predictable location for backup scripts, rather than Docker's default path under /var/lib/docker/volumes/. The directory must exist before docker compose up.
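
Since the directory has to exist first, a small pre-flight step before docker compose up avoids a failed start. The helper below is a hypothetical sketch; the absolute-path check is an added precaution of this example, not something the Compose file above enforces.

```shell
ensure_volume_dir() {
  # Usage: ensure_volume_dir <absolute-host-path>
  local dir="$1"
  case "$dir" in
    /*) ;;   # absolute path, as used for device: in the Compose file
    *)
      echo "need an absolute path: $dir" >&2
      return 1
      ;;
  esac
  # -p makes this safe to run repeatedly and creates missing parents.
  mkdir -p "$dir"
}
```

A deploy script could then run ensure_volume_dir /data/postgres && docker compose up -d, which may need root privileges depending on who owns /data.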
