Docker Compose — Advanced Cheat Sheet

Core Concepts

Docker Compose is a declarative orchestrator for multi-container applications on a single host. It maps directly to Docker Engine primitives — every docker compose command is syntactic sugar over Docker API calls.

| Concept | Maps to | Notes |
| --- | --- | --- |
| service | Container template | One service = N replicas |
| network | Docker network | Default: bridge driver |
| volume | Docker volume / bind mount | Persists outside container lifecycle |
| secret | Encrypted file mount | Swarm-native; Compose emulates via bind mount |
| config | Non-sensitive file mount | Similar to secret, plain text |

Compose file lookup order (first match wins):

compose.yaml → compose.yml → docker-compose.yaml → docker-compose.yml

File Structure & Schema

# compose.yaml — top-level keys
name: myapp                  # Project name (default: directory name)

services:    { ... }         # Required — define containers
networks:    { ... }         # Optional — custom network config
volumes:     { ... }         # Optional — named volume declarations
configs:     { ... }         # Optional — non-secret config files
secrets:     { ... }         # Optional — sensitive data

Spec version — Compose V2 (shipped with Docker Desktop ≥ 3.x and Docker Engine ≥ 20.10) no longer requires the version: key. Drop it.

# Old — avoid
version: "3.9"

# Modern — no version key needed
name: myapp
services:
  web:
    image: nginx

Services

Minimal service:

services:
  api:
    image: node:20-alpine
    command: ["node", "dist/index.js"]
    working_dir: /app
    ports:
      - "3000:3000"          # HOST:CONTAINER
    restart: unless-stopped

Full service anatomy:

services:
  app:
    image: myrepo/app:1.2.3

    # --- Identity ---
    container_name: app-prod   # Avoid in scaled services
    hostname: app-node-1

    # --- Lifecycle ---
    restart: on-failure        # no | always | on-failure | unless-stopped
    stop_grace_period: 30s     # SIGTERM → wait → SIGKILL timeout
    init: true                 # Tini as PID 1 — prevents zombie processes

    # --- Execution ---
    user: "1000:1000"          # uid:gid — never run as root in prod
    working_dir: /app
    entrypoint: ["/docker-entrypoint.sh"]
    command: ["--config", "/etc/app/config.yaml"]
    read_only: true            # Immutable root FS
    tmpfs:
      - /tmp                   # Writable scratch space on read-only containers

    # --- Ports ---
    ports:
      - "127.0.0.1:8080:80"    # Bind to localhost only — don't expose to 0.0.0.0 in prod
      - target: 443
        host_ip: "0.0.0.0"
        published: "443"
        protocol: tcp
        mode: host             # Bypass network proxy; direct host port binding

    # --- Capabilities ---
    cap_drop: [ALL]             # Drop all Linux capabilities first
    cap_add: [NET_BIND_SERVICE] # Re-add only what's needed

    # --- Labels ---
    labels:
      app.version: "1.2.3"
      traefik.enable: "true"

Networking

Default Behavior

Every project gets a default bridge network named <project>_default. All services on it can resolve each other by service name as DNS.

# Service 'api' can reach 'db' via DNS at db:5432 — no extra config
services:
  api:
    image: myapp
  db:
    image: postgres:16

Custom Networks

services:
  api:
    networks:
      - frontend
      - backend

  db:
    networks:
      - backend           # db is NOT reachable from frontend network

  nginx:
    networks:
      - frontend

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true        # No outbound internet — isolated network

Network Aliases & Static IPs

services:
  app:
    networks:
      app-net:
        aliases:
          - app.internal   # Additional DNS names
        ipv4_address: 172.20.0.10

networks:
  app-net:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/24
          gateway: 172.20.0.1

Connecting to External Networks

networks:
  shared-infra:
    external: true          # Must exist before compose up; not managed by Compose
    name: mycompany-infra

Ports & Expose

ports vs expose — Key Difference

| Key | Visible to | Syntax | Use case |
| --- | --- | --- | --- |
| ports | Host and other containers | HOST:CONTAINER | Public-facing services |
| expose | Other containers on same network only | Container port number | Internal services, no host binding |

ports — Full Syntax

services:
  web:
    ports:
      # Short syntax: "host:container"
      - "8080:80"

      # Bind to specific host interface (production best practice)
      - "127.0.0.1:8080:80"

      # Long syntax — explicit and unambiguous
      - target: 80           # Container port
        published: "8080"    # Host port (string to allow ranges)
        host_ip: "127.0.0.1" # Bind interface; omit to bind 0.0.0.0
        protocol: tcp        # tcp | udp
        mode: ingress        # ingress (default) | host

      # UDP port
      - "5353:5353/udp"

      # Port range — maps host 9000-9010 → container 9000-9010
      - "9000-9010:9000-9010"

      # Ephemeral host port — Docker picks an available host port
      - "80"

mode: host vs mode: ingress

services:
  app:
    ports:
      # ingress (default) — traffic goes through Docker's userland proxy
      # Adds ~microseconds of latency; works with --scale
      - target: 80
        published: "80"
        mode: ingress

      # host — bypasses userland proxy, binds directly to host NIC
      # Lower latency; incompatible with --scale > 1 (port conflict)
      - target: 80
        published: "80"
        mode: host

expose — Container-Only Ports

services:
  api:
    image: myapp/api
    expose:
      - "3000"    # Reachable as api:3000 from other services
      - "9229"    # Node.js debug port — internal only, never bind to host

  nginx:
    image: nginx
    ports:
      - "127.0.0.1:80:80"   # Only nginx is host-accessible
    depends_on:
      - api                  # Proxies to api:3000 internally

expose is documentation-level in Compose

For containers on the same Compose network, all ports are reachable between services regardless of expose. The expose key is purely informational in Compose — it documents intent (and feeds legacy --link behavior) but neither opens nor restricts anything. Rely on network segmentation (separate networks) to truly isolate services.

Protocol-Specific Examples

services:
  gameserver:
    image: myapp/gameserver
    ports:
      - "7777:7777/udp"      # UDP-only game port
      - "27015:27015/tcp"    # TCP control port

  dns:
    image: pihole/pihole
    ports:
      - "53:53/tcp"
      - "53:53/udp"          # Same port, both protocols
      - "127.0.0.1:8053:80"  # Admin UI — localhost only

Don't Expose What Doesn't Need To Be

services:
  # ✅ Only the reverse proxy is exposed to the host
  traefik:
    ports:
      - "80:80"
      - "443:443"

  # ✅ App is internal — reachable by Traefik via service name, not host
  app:
    expose:
      - "3000"

  # ✅ DB never touches the host network
  postgres:
    expose:
      - "5432"
    # No ports: key at all

Volumes & Storage

Volume Types Compared

| Type | Declaration | Use case | Persistence |
| --- | --- | --- | --- |
| Named volume | volumes: top-level | Databases, stateful apps | Survives down; removed by down -v |
| Bind mount | Absolute or relative path | Dev hot-reload, config injection | Host filesystem |
| tmpfs | tmpfs: key | Ephemeral scratch, secrets in RAM | Gone on stop |
| Anonymous | No name, inline | Throw-away cache | Removed with container |

services:
  db:
    image: postgres:16
    volumes:
      # Named volume — recommended for databases
      - pgdata:/var/lib/postgresql/data

      # Bind mount — dev config injection
      - ./postgres/init.sql:/docker-entrypoint-initdb.d/init.sql:ro

      # tmpfs — secrets or scratch
      - type: tmpfs
        target: /dev/shm
        tmpfs:
          size: 134217728    # 128 MB in bytes

volumes:
  pgdata:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/postgres  # Map named volume to specific host path

Advanced Volume Options

volumes:
  # NFS mount via named volume
  nfs-data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.168.1.100,rw,nfsvers=4"
      device: ":/exports/data"

  # External volume (pre-created, not managed by Compose)
  existing-data:
    external: true
    name: legacy-database-volume

Environment & Secrets

Environment Variables — Precedence (high → low)

For the container's environment: docker compose run -e → environment: block → env_file: → image ENV defaults. For variable substitution inside compose.yaml itself, the shell environment overrides values from .env.

services:
  app:
    image: myapp
    environment:
      # Literal value
      NODE_ENV: production
      PORT: "3000"

      # Pass-through from host shell (no value = take from shell)
      AWS_ACCESS_KEY_ID:
      AWS_SECRET_ACCESS_KEY:

    # Load from file
    env_file:
      - .env
      - .env.local          # Merged; later files override earlier ones
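
If the same variable appears in both env_file and environment:, the environment: value wins. A quick sketch (file and variable names illustrative):

```yaml
# Assume .env.local contains LOG_LEVEL=warn
services:
  app:
    image: myapp
    env_file:
      - .env.local        # Sets LOG_LEVEL=warn
    environment:
      LOG_LEVEL: debug    # environment: overrides env_file; container sees "debug"
```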

.env file (auto-loaded for variable substitution in compose.yaml):

# .env
POSTGRES_VERSION=16
APP_PORT=3000
REGISTRY=ghcr.io/myorg

# compose.yaml — uses .env values
services:
  db:
    image: postgres:${POSTGRES_VERSION:-15}   # With fallback default
  app:
    image: ${REGISTRY}/app:${TAG:?TAG must be set}  # Fail fast if missing
    ports:
      - "${APP_PORT}:3000"

Secrets

services:
  app:
    secrets:
      - db_password
      - api_key

    # Secrets mounted at /run/secrets/<name> by default
    environment:
      DB_PASSWORD_FILE: /run/secrets/db_password  # App reads file, not env var

secrets:
  db_password:
    file: ./secrets/db_password.txt    # Dev: read from file
  api_key:
    environment: API_KEY               # Source from host environment variable

Secret handling best practice

Never put secrets in environment: blocks — they appear in docker inspect output. Use secret mounts or dedicated secret managers (Vault, AWS Secrets Manager) in production.


Health Checks & Dependencies

Defining Health Checks

services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 10s          # Time between checks
      timeout: 5s            # Check must complete within this
      retries: 5             # Failures before marking unhealthy
      start_period: 30s      # Grace period — failures during this don't count
      start_interval: 2s     # Poll rate during start_period (Compose v2.20+)

  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 3
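
For HTTP services, a similar check works with whatever client the image ships. A sketch assuming the image includes a wget binary (many slim images have neither curl nor wget) and a hypothetical /healthz endpoint:

```yaml
services:
  api:
    image: myapp/api
    healthcheck:
      # --spider makes wget check reachability without downloading the body
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:3000/healthz"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s
```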

Startup Ordering with depends_on

services:
  app:
    image: myapp
    depends_on:
      db:
        condition: service_healthy    # Wait for health check to pass
        restart: true                 # Restart app if db restarts (Compose v2.17+)
      redis:
        condition: service_healthy
      migrations:
        condition: service_completed_successfully  # One-shot job must exit 0

  migrations:
    image: myapp
    command: ["npm", "run", "migrate"]
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:16
    healthcheck: { ... }

Warning

depends_on controls startup order, not application-level readiness inside the container. Always implement retry logic in your application.
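
One pattern is a small entrypoint wrapper that retries a readiness command with a delay before starting the app. A minimal sketch, assuming a POSIX shell in the image; the pg_isready check and app command are placeholders:

```shell
#!/usr/bin/env sh
# entrypoint.sh: retry a readiness check before starting the app (illustrative)
set -eu

retry() {
  # Usage: retry <max_attempts> <command...>
  max=$1; shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "check failed after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in 2s..." >&2
    sleep 2
    attempt=$((attempt + 1))
  done
}

# Placeholders; substitute your own readiness check and app start:
# retry 30 pg_isready -h db -U "$POSTGRES_USER"
# exec node dist/index.js
```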


Build Configuration

Basic Build

services:
  app:
    build:
      context: .             # Build context (directory sent to daemon)
      dockerfile: Dockerfile.prod

Advanced Build Args, Targets & Cache

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
      target: production     # Multi-stage target
      args:
        NODE_VERSION: "20"
        BUILD_DATE: ${BUILD_DATE}
      cache_from:
        - type=registry,ref=ghcr.io/myorg/app:cache
        - type=local,src=/tmp/buildkit-cache
      cache_to:
        - type=inline
      platforms:
        - linux/amd64
        - linux/arm64
      labels:
        org.opencontainers.image.source: https://github.com/myorg/app
      shm_size: "256m"        # Larger /dev/shm for build (e.g., pytest)

Multi-stage Dockerfile Pattern

# Dockerfile
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

FROM node:20-alpine AS build
WORKDIR /app
COPY . .
RUN npm ci && npm run build

FROM node:20-alpine AS production
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
USER node
CMD ["node", "dist/index.js"]

Profiles & Conditional Services

Profiles let you define services that only start when explicitly requested — ideal for dev tools, debug containers, or optional dependencies.

services:
  app:
    image: myapp
    # No profile — always starts

  adminer:
    image: adminer
    profiles: [tools]        # Only with: docker compose --profile tools up
    ports:
      - "8080:8080"

  debug-proxy:
    image: mitmproxy/mitmproxy
    profiles: [debug]

  load-test:
    image: grafana/k6
    profiles: [perf]
    command: run /scripts/load-test.js
    volumes:
      - ./k6:/scripts

# Start base services
docker compose up -d

# Start with tools profile
docker compose --profile tools up -d

# Multiple profiles
docker compose --profile tools --profile debug up -d

# Via environment variable
COMPOSE_PROFILES=tools,debug docker compose up -d

Scaling & Resource Limits

Replicas

services:
  worker:
    image: myworker
    deploy:
      replicas: 4
      # Note: most deploy options only apply in Swarm mode
      # For standalone Compose, use --scale flag

# Override replicas at runtime
docker compose up -d --scale worker=4

Resource Limits (Compose Standalone)

services:
  app:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: "1.5"         # Max 1.5 CPU cores
          memory: 512M
          pids: 100           # Max process count
        reservations:
          cpus: "0.5"         # Guaranteed minimum
          memory: 256M

Note

deploy.resources works in standalone Compose (not just Swarm) as of Compose v2.


Extends, Anchors & Overrides

YAML Anchors (DRY config within one file)

x-common-logging: &logging
  driver: json-file
  options:
    max-size: "10m"
    max-file: "3"

x-app-base: &app-base
  restart: unless-stopped
  networks:
    - backend
  logging: *logging

services:
  api:
    <<: *app-base             # Merge anchor
    image: myapp/api:latest
    ports:
      - "3000:3000"

  worker:
    <<: *app-base
    image: myapp/worker:latest

extends — Inherit Across Files

base.yaml
services:
  app:
    image: myapp
    environment:
      LOG_LEVEL: info
    deploy:
      resources:
        limits:
          memory: 256M
compose.yaml
services:
  app:
    extends:
      file: base.yaml
      service: app
    environment:
      LOG_LEVEL: debug        # Override inherited value
    ports:
      - "3000:3000"           # Add new config

Compose File Overrides

# docker compose merges files in order — later files override earlier
docker compose -f compose.yaml -f compose.prod.yaml up -d

# compose.prod.yaml — only overrides what differs
services:
  app:
    image: myapp:${TAG:?}
    environment:
      NODE_ENV: production
    deploy:
      resources:
        limits:
          memory: 1G

COMPOSE_FILE Environment Variable

# .env
COMPOSE_FILE=compose.yaml:compose.prod.yaml   # ':' separator; configurable via COMPOSE_PATH_SEPARATOR

CLI Quick Reference

Lifecycle

# Start (detached), build if needed
docker compose up -d --build

# Stop containers, preserve volumes
docker compose down

# Stop AND remove volumes (destructive!)
docker compose down -v

# Restart single service without full down/up
docker compose restart app

# Remove stopped containers
docker compose rm -f

Inspection

# Running services and their ports
docker compose ps

# Follow logs (all services)
docker compose logs -f

# Follow logs for specific service
docker compose logs -f --tail=100 app

# Inspect resolved compose config (after variable substitution)
docker compose config

# Inspect service environment
docker compose exec app env

Running Commands

# Exec in running container
docker compose exec app sh

# Run one-off command in new container (inherits service config)
docker compose run --rm app npm run migrate

# Override entrypoint
docker compose run --rm --entrypoint sh app

Build & Push

# Build all services (Compose v2 builds in parallel by default)
docker compose build --no-cache

# Build specific service
docker compose build app

# Push all images
docker compose push

# Pull latest images
docker compose pull

Advanced

# Dry-run: see what would change
docker compose up --dry-run

# Wait for services to be healthy before returning
docker compose up -d --wait

# Scale specific service
docker compose up -d --scale worker=8

# Show resource usage
docker compose top
docker stats $(docker compose ps -q)

# Copy files to/from container
docker compose cp app:/app/logs ./local-logs

Production Patterns

Pattern 1: Reverse Proxy with Traefik

services:
  traefik:
    image: traefik:v3
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.letsencrypt.acme.email=ops@example.com"
      - "--certificatesresolvers.letsencrypt.acme.storage=/acme/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - traefik-acme:/acme

  api:
    image: myapp/api:${TAG}
    labels:
      traefik.enable: "true"
      traefik.http.routers.api.rule: "Host(`api.example.com`)"
      traefik.http.routers.api.entrypoints: "websecure"
      traefik.http.routers.api.tls.certresolver: "letsencrypt"
      traefik.http.services.api.loadbalancer.server.port: "3000"

volumes:
  traefik-acme:

Pattern 2: Database with Init Scripts & Backups

services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
      POSTGRES_DB: ${DB_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./db/init:/docker-entrypoint-initdb.d:ro   # Init scripts run on first start
    secrets:
      - db_password
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
      interval: 10s
      retries: 5
      start_period: 20s

  pgbackup:
    image: prodrigestivill/postgres-backup-local
    environment:
      POSTGRES_HOST: postgres
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
      POSTGRES_DB: ${DB_NAME}
      SCHEDULE: "@daily"
      BACKUP_KEEP_DAYS: 7
    volumes:
      - ./backups:/backups
    secrets:
      - db_password
    depends_on:
      postgres:
        condition: service_healthy

Pattern 3: Zero-Downtime Redeploy Script

#!/usr/bin/env bash
# deploy.sh
set -euo pipefail

TAG=${1:?Usage: deploy.sh <tag>}
export TAG

echo "Pulling new image..."
docker compose pull app

echo "Deploying with zero-downtime scale..."
docker compose up -d --scale app=2 --no-recreate

echo "Waiting for new instance to be healthy..."
sleep 15   # Crude; prefer 'docker compose up -d --wait' when services define healthchecks

echo "Removing old instance..."
docker compose up -d --scale app=1 --no-recreate

echo "Deploy complete. Running containers:"
docker compose ps

Pattern 4: Init Container Pattern

services:
  # Init container — runs first, must exit 0
  migrate:
    image: myapp:${TAG}
    command: ["npm", "run", "db:migrate"]
    environment:
      DATABASE_URL: ${DATABASE_URL}
    depends_on:
      db:
        condition: service_healthy
    restart: "no"             # Don't restart if migrations fail — fail loudly

  app:
    image: myapp:${TAG}
    depends_on:
      migrate:
        condition: service_completed_successfully
      db:
        condition: service_healthy

Pattern 5: Observability Stack

services:
  app:
    image: myapp
    environment:
      OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4317

  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    volumes:
      - ./otel-config.yaml:/etc/otel/config.yaml
    command: ["--config=/etc/otel/config.yaml"]

  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    ports:
      - "127.0.0.1:9090:9090"

  grafana:
    image: grafana/grafana:latest
    environment:
      GF_SECURITY_ADMIN_PASSWORD__FILE: /run/secrets/grafana_pass
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning:ro
    ports:
      - "127.0.0.1:3001:3000"
    secrets:
      - grafana_pass

Common Gotchas

1. Port binding 0.0.0.0 vs 127.0.0.1

# ❌ Exposes port to all interfaces — dangerous in cloud VMs
ports:
  - "5432:5432"

# ✅ Only accessible locally
ports:
  - "127.0.0.1:5432:5432"

2. Volume mount shadows container content

# If ./app doesn't have node_modules, this silently removes them
volumes:
  - ./app:/app   # Bind mount hides container's /app/node_modules

Fix: Use an anonymous volume to "protect" node_modules:

volumes:
  - ./app:/app
  - /app/node_modules   # Anonymous volume — not overwritten by bind mount

3. .env is for Compose variable substitution, not container env

.env values are interpolated into compose.yaml. They are not automatically injected as container environment variables unless you reference them in environment:.

# .env: SECRET_KEY=abc123

services:
  app:
    # ❌ Container does NOT get SECRET_KEY automatically
    image: myapp

    # ✅ Explicit pass-through
    environment:
      SECRET_KEY: ${SECRET_KEY}

4. restart: always vs unless-stopped

| Policy | Restarts on | After docker stop | After daemon restart |
| --- | --- | --- | --- |
| always | Any exit | Restarts when the daemon restarts | Restarted |
| unless-stopped | Any exit | Stays stopped | Restarted only if it was running |
| on-failure | Non-zero exit | Stays stopped | Restarted only if it was running |

Use unless-stopped for most production services.

5. command overrides, entrypoint replaces

# ENTRYPOINT ["/bin/sh"] CMD ["default.sh"]

command: custom.sh          # → /bin/sh custom.sh  (CMD replaced)
entrypoint: ["/usr/bin/env"] # → /usr/bin/env  (ENTRYPOINT replaced, CMD ignored)

6. Networking between docker compose projects

Services in different Compose projects cannot resolve each other by default. Use a shared external network:

docker network create shared-net
# Project A & B
networks:
  shared-net:
    external: true
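
A fuller sketch of what each project's compose file might contain (service and network names are illustrative):

```yaml
# In both project A's and project B's compose.yaml
services:
  app:
    image: myapp
    networks:
      - shared-net        # Attach the service to the shared network

networks:
  shared-net:
    external: true        # Created once with: docker network create shared-net
```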

7. depends_on does not wait for application readiness

The service_healthy condition waits for the Docker health check — not for your app to be actually serving traffic. Always implement retry/backoff in application startup code.

8. Build context size matters

Docker sends the entire context to the daemon. A missing or permissive .dockerignore makes builds slow and images large.

# .dockerignore — always include this
.git
node_modules
dist
*.log
.env*
.DS_Store
coverage
__pycache__

Quick Reference Card

# The 6 commands you'll use 90% of the time
docker compose up -d --build          # Build + start everything
docker compose down -v                # Teardown including volumes
docker compose logs -f app            # Follow service logs
docker compose exec app sh            # Shell into running container
docker compose run --rm app <cmd>     # One-off command
docker compose ps                     # Service status