# Docker Compose — Advanced Cheat Sheet

## Core Concepts

Docker Compose is a declarative orchestrator for multi-container applications on a single host. It maps directly to Docker Engine primitives — every `docker compose` command is syntactic sugar over Docker API calls.
| Concept | Maps To | Notes |
|---|---|---|
| `service` | Container template | One service = N replicas |
| `network` | Docker network | Default: `bridge` driver |
| `volume` | Docker volume / bind mount | Persists outside container lifecycle |
| `secret` | Encrypted file mount | Swarm-native; Compose emulates via bind mount |
| `config` | Non-sensitive file mount | Similar to `secret`, plain text |
Compose file lookup order (first match wins):

1. `compose.yaml` (preferred)
2. `compose.yml`
3. `docker-compose.yaml`
4. `docker-compose.yml` (legacy)
## File Structure & Schema

```yaml
# compose.yaml — top-level keys
name: myapp          # Project name (default: directory name)
services: { ... }    # Required — define containers
networks: { ... }    # Optional — custom network config
volumes: { ... }     # Optional — named volume declarations
configs: { ... }     # Optional — non-secret config files
secrets: { ... }     # Optional — sensitive data
```
**Spec version** — Compose V2 (shipped with Docker Desktop ≥ 3.x and Docker Engine ≥ 20.10) no longer requires the `version:` key. Drop it.

```yaml
# Old — avoid
version: "3.9"
```

```yaml
# Modern — no version key needed
name: myapp
services:
  web:
    image: nginx
```
## Services

```yaml
services:
  app:
    image: myrepo/app:1.2.3

    # --- Identity ---
    container_name: app-prod      # Avoid in scaled services
    hostname: app-node-1

    # --- Lifecycle ---
    restart: on-failure           # no | always | on-failure | unless-stopped
    stop_grace_period: 30s        # SIGTERM → wait → SIGKILL timeout
    init: true                    # Tini as PID 1 — prevents zombie processes

    # --- Execution ---
    user: "1000:1000"             # uid:gid — never run as root in prod
    working_dir: /app
    entrypoint: ["/docker-entrypoint.sh"]
    command: ["--config", "/etc/app/config.yaml"]
    read_only: true               # Immutable root FS
    tmpfs:
      - /tmp                      # Writable scratch space on read-only containers

    # --- Ports ---
    ports:
      - "127.0.0.1:8080:80"       # Bind to localhost only — don't expose to 0.0.0.0 in prod
      - target: 443
        host_ip: "0.0.0.0"
        published: "443"
        protocol: tcp
        mode: host                # Bypass userland proxy; direct host port binding

    # --- Capabilities ---
    cap_drop: [ALL]               # Drop all Linux capabilities first
    cap_add: [NET_BIND_SERVICE]   # Re-add only what's needed

    # --- Labels ---
    labels:
      app.version: "1.2.3"
      traefik.enable: "true"
```
## Networking

### Default Behavior

Every project gets a default bridge network named `<project>_default`. All services on it can resolve each other by service name via Docker's built-in DNS.

```yaml
# Service 'api' can reach 'db' at 'db:5432' — no extra config
services:
  api:
    image: myapp
  db:
    image: postgres:16
```
### Custom Networks

```yaml
services:
  api:
    networks:
      - frontend
      - backend
  db:
    networks:
      - backend      # db is NOT reachable from the frontend network
  nginx:
    networks:
      - frontend

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true   # No outbound internet — isolated network
```
### Network Aliases & Static IPs

```yaml
services:
  app:
    networks:
      app-net:
        aliases:
          - app.internal          # Additional DNS names
        ipv4_address: 172.20.0.10

networks:
  app-net:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/24
          gateway: 172.20.0.1
```
### Connecting to External Networks

```yaml
networks:
  shared-infra:
    external: true          # Must exist before `compose up`; not managed by Compose
    name: mycompany-infra   # Pre-create with: docker network create mycompany-infra
```
## Ports & Expose

### ports vs expose — Key Difference

| Key | Visible To | Syntax | Use Case |
|---|---|---|---|
| `ports` | Host and other containers | `HOST:CONTAINER` | Public-facing services |
| `expose` | Other containers on the same network only | Container port number | Internal services, no host binding |
### ports — Full Syntax

```yaml
services:
  web:
    ports:
      # Short syntax: "host:container"
      - "8080:80"

      # Bind to a specific host interface (production best practice)
      - "127.0.0.1:8080:80"

      # Long syntax — explicit and unambiguous
      - target: 80             # Container port
        published: "8080"      # Host port (string to allow ranges)
        host_ip: "127.0.0.1"   # Bind interface; omit to bind 0.0.0.0
        protocol: tcp          # tcp | udp
        mode: ingress          # ingress (default) | host

      # UDP port
      - "5353:5353/udp"

      # Port range — maps host 9000-9010 → container 9000-9010
      - "9000-9010:9000-9010"

      # Ephemeral host port — Docker picks an available host port
      - "80"
```
### mode: host vs mode: ingress

```yaml
services:
  app:
    ports:
      # ingress (default) — traffic goes through Docker's userland proxy
      # Small latency overhead; works with --scale
      - target: 80
        published: "80"
        mode: ingress

      # host — bypasses the userland proxy, binds directly to the host NIC
      # Lower latency; incompatible with --scale > 1 (port conflict)
      - target: 80
        published: "80"
        mode: host
```
### expose — Container-Only Ports

```yaml
services:
  api:
    image: myapp/api
    expose:
      - "3000"   # Reachable as api:3000 from other services
      - "9229"   # Node.js debug port — internal only, never bind to host

  nginx:
    image: nginx
    ports:
      - "127.0.0.1:80:80"   # Only nginx is host-accessible
    depends_on:
      - api                 # Proxies to api:3000 internally
```

> **expose is documentation-only in Compose.** Containers on the same Compose network can reach each other on *any* listening port, regardless of `expose` — the key records intent but enforces nothing (it only mattered for legacy `--link` setups). Rely on network segmentation (separate networks) to truly isolate services.
### Protocol-Specific Examples

```yaml
services:
  gameserver:
    image: myapp/gameserver
    ports:
      - "7777:7777/udp"       # UDP-only game port
      - "27015:27015/tcp"     # TCP control port

  dns:
    image: pihole/pihole
    ports:
      - "53:53/tcp"
      - "53:53/udp"           # Same port, both protocols
      - "127.0.0.1:8053:80"   # Admin UI — localhost only
```
### Don't Expose What Doesn't Need To Be

```yaml
services:
  # ✅ Only the reverse proxy is exposed to the host
  traefik:
    ports:
      - "80:80"
      - "443:443"

  # ✅ App is internal — reachable by Traefik via service name, not host
  app:
    expose:
      - "3000"

  # ✅ DB never touches the host network
  postgres:
    expose:
      - "5432"
    # No ports: key at all
```
## Volumes & Storage

### Volume Types Compared

| Type | Declaration | Use Case | Persistence |
|---|---|---|---|
| Named volume | `volumes:` top-level | Databases, stateful apps | Survives `down`; removed by `down -v` |
| Bind mount | Absolute or relative path | Dev hot-reload, config injection | Host filesystem |
| tmpfs | `tmpfs:` key | Ephemeral scratch, secrets in RAM | Gone on stop |
| Anonymous | No name, inline | Throw-away cache | Removed with container |
```yaml
services:
  db:
    image: postgres:16
    volumes:
      # Named volume — recommended for databases
      - pgdata:/var/lib/postgresql/data

      # Bind mount — dev config injection
      - ./postgres/init.sql:/docker-entrypoint-initdb.d/init.sql:ro

      # tmpfs — secrets or scratch
      - type: tmpfs
        target: /dev/shm
        tmpfs:
          size: 134217728   # 128 MB in bytes

volumes:
  pgdata:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/postgres   # Map named volume to a specific host path
```
### Advanced Volume Options

```yaml
volumes:
  # NFS mount via named volume
  nfs-data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.168.1.100,rw,nfsvers=4"
      device: ":/exports/data"

  # External volume — pre-create with `docker volume create`; not managed by Compose
  existing-data:
    external: true
    name: legacy-database-volume
```
## Environment & Secrets

### Environment Variables — Precedence (high → low)

1. `docker compose run -e VAR=value` (CLI override)
2. Shell environment values substituted into `environment:`
3. Literal `environment:` values in the Compose file
4. `env_file:` entries
5. `ENV` instructions baked into the image
```yaml
services:
  app:
    image: myapp
    environment:
      # Literal value
      NODE_ENV: production
      PORT: "3000"
      # Pass-through from host shell (no value = take from shell)
      AWS_ACCESS_KEY_ID:
      AWS_SECRET_ACCESS_KEY:
    # Load from files
    env_file:
      - .env
      - .env.local   # Merged; later files override earlier ones
```
The `.env` file is auto-loaded for variable substitution in `compose.yaml`:

```yaml
# compose.yaml — uses .env values
services:
  db:
    image: postgres:${POSTGRES_VERSION:-15}          # With fallback default
  app:
    image: ${REGISTRY}/app:${TAG:?TAG must be set}   # Fail fast if missing
    ports:
      - "${APP_PORT}:3000"
```
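Compose's `${VAR:-default}` and `${VAR:?error}` operators follow POSIX parameter expansion, so you can sanity-check their behavior in any plain `sh` (the values below are illustrative):

```shell
# Interpolation semantics mirror POSIX parameter expansion.
unset POSTGRES_VERSION
image="postgres:${POSTGRES_VERSION:-15}"   # unset → fallback applies
echo "$image"                              # → postgres:15

TAG=1.2.3
app="app:${TAG:?TAG must be set}"          # set → expands; unset would abort with the message
echo "$app"                                # → app:1.2.3
```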
### Secrets

```yaml
services:
  app:
    secrets:
      - db_password
      - api_key
    # Secrets mounted at /run/secrets/<name> by default
    environment:
      DB_PASSWORD_FILE: /run/secrets/db_password   # App reads the file, not an env var

secrets:
  db_password:
    file: ./secrets/db_password.txt   # Dev: read from file
  api_key:
    environment: API_KEY              # Source from a host environment variable
```

> **Secret handling best practice:** never put secrets in `environment:` blocks — they appear in `docker inspect` output. Use secret mounts or dedicated secret managers (Vault, AWS Secrets Manager) in production.
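The `_FILE` convention above is usually resolved by a tiny entrypoint shim before the app starts. A runnable sketch in plain `sh` (the variable name and secret value are illustrative):

```shell
# Simulate a mounted secret, then resolve the *_FILE indirection the way
# an entrypoint wrapper would before exec'ing the real app.
secret_file=$(mktemp)
printf 's3cr3t' > "$secret_file"

DB_PASSWORD_FILE="$secret_file"
if [ -n "${DB_PASSWORD_FILE:-}" ]; then
  DB_PASSWORD=$(cat "$DB_PASSWORD_FILE")
  export DB_PASSWORD        # Visible to the app process, not in docker inspect
fi
echo "$DB_PASSWORD"         # → s3cr3t
rm -f "$secret_file"
```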
## Health Checks & Dependencies

### Defining Health Checks

```yaml
services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 10s        # Time between checks
      timeout: 5s          # Check must complete within this
      retries: 5           # Failures before marking unhealthy
      start_period: 30s    # Grace period — failures during this don't count
      start_interval: 2s   # Poll rate during start_period (Compose v2.20+)

  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 3
```
### Startup Ordering with depends_on

```yaml
services:
  app:
    image: myapp
    depends_on:
      db:
        condition: service_healthy   # Wait for the health check to pass
        restart: true                # Restart app if db restarts (Compose v2.17+)
      redis:
        condition: service_healthy
      migrations:
        condition: service_completed_successfully   # One-shot job must exit 0

  migrations:
    image: myapp
    command: ["npm", "run", "migrate"]
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:16
    healthcheck: { ... }
```

> **Warning:** `depends_on` controls startup order, not application-level readiness inside the container. Always implement retry logic in your application.
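One way to honor that warning is a small retry helper in the container's entrypoint. A sketch in POSIX `sh` — the `nc -z db 5432` probe in the usage comment is an assumed example, not something Compose provides:

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds, sleeping DELAY seconds
# between attempts; give up (return 1) after MAX tries.
retry() {
  max=$1; delay=$2; shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$max" ]; then
      echo "retry: giving up after $max attempts" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Hypothetical entrypoint usage: wait for the db, then start the app
# retry 30 2 nc -z db 5432 && exec node server.js
```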
## Build Configuration

### Basic Build

```yaml
services:
  app:
    build:
      context: .                    # Build context (directory sent to the daemon)
      dockerfile: Dockerfile.prod
```
### Advanced Build Args, Targets & Cache

```yaml
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
      target: production            # Multi-stage target
      args:
        NODE_VERSION: "20"
        BUILD_DATE: ${BUILD_DATE}
      cache_from:
        - type=registry,ref=ghcr.io/myorg/app:cache
        - type=local,src=/tmp/buildkit-cache
      cache_to:
        - type=inline
      platforms:
        - linux/amd64
        - linux/arm64
      labels:
        org.opencontainers.image.source: https://github.com/myorg/app
      shm_size: "256m"              # Larger /dev/shm for the build (e.g., pytest)
```
### Multi-stage Dockerfile Pattern

```dockerfile
# Dockerfile
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev   # --only=production is deprecated

FROM node:20-alpine AS build
WORKDIR /app
COPY . .
RUN npm ci && npm run build

FROM node:20-alpine AS production
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
USER node
CMD ["node", "dist/index.js"]
```
## Profiles & Conditional Services

Profiles let you define services that only start when explicitly requested — ideal for dev tools, debug containers, or optional dependencies.

```yaml
services:
  app:
    image: myapp
    # No profile — always starts

  adminer:
    image: adminer
    profiles: [tools]    # Only with: docker compose --profile tools up
    ports:
      - "8080:8080"

  debug-proxy:
    image: mitmproxy/mitmproxy
    profiles: [debug]

  load-test:
    image: grafana/k6
    profiles: [perf]
    command: run /scripts/load-test.js
    volumes:
      - ./k6:/scripts
```
```shell
# Start base services
docker compose up -d

# Start with the tools profile
docker compose --profile tools up -d

# Multiple profiles
docker compose --profile tools --profile debug up -d

# Via environment variable
COMPOSE_PROFILES=tools,debug docker compose up -d
```
## Scaling & Resource Limits

### Replicas

```yaml
services:
  worker:
    image: myworker
    deploy:
      replicas: 4
      # replicas is honored by standalone Compose v2; many other deploy
      # options remain Swarm-only. Runtime override: --scale worker=N
```
### Resource Limits (Compose Standalone)

```yaml
services:
  app:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: "1.5"    # Max 1.5 CPU cores
          memory: 512M
          pids: 100      # Max process count
        reservations:
          cpus: "0.5"    # Guaranteed minimum
          memory: 256M
```

> **Note:** `deploy.resources` works in standalone Compose (not just Swarm) as of Compose v2.
## Extends, Anchors & Overrides

### YAML Anchors (DRY config within one file)

```yaml
x-common-logging: &logging
  driver: json-file
  options:
    max-size: "10m"
    max-file: "3"

x-app-base: &app-base
  restart: unless-stopped
  networks:
    - backend
  logging: *logging

services:
  api:
    <<: *app-base             # Merge anchor
    image: myapp/api:latest
    ports:
      - "3000:3000"

  worker:
    <<: *app-base
    image: myapp/worker:latest
```
### extends — Inherit Across Files

```yaml
# base.yaml
services:
  app:
    image: myapp
    environment:
      LOG_LEVEL: info
    deploy:
      resources:
        limits:
          memory: 256M
```

```yaml
# compose.yaml
services:
  app:
    extends:
      file: base.yaml
      service: app
    environment:
      LOG_LEVEL: debug   # Override inherited value
    ports:
      - "3000:3000"      # Add new config
```
### Compose File Overrides

```shell
# docker compose merges files in order — later files override earlier
docker compose -f compose.yaml -f compose.prod.yaml up -d
```

```yaml
# compose.prod.yaml — only overrides what differs
services:
  app:
    image: myapp:${TAG:?}
    environment:
      NODE_ENV: production
    deploy:
      resources:
        limits:
          memory: 1G
```
### COMPOSE_FILE Environment Variable
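The `COMPOSE_FILE` variable pins the `-f` file list so every invocation merges the same files without repeating flags. The file names below are illustrative; the path separator is `:` on Linux/macOS and `;` on Windows:

```shell
# Equivalent to passing -f compose.yaml -f compose.prod.yaml on every call
export COMPOSE_FILE=compose.yaml:compose.prod.yaml
# From here on, plain `docker compose up -d` / `docker compose config`
# resolve against both files.
echo "$COMPOSE_FILE"   # → compose.yaml:compose.prod.yaml
```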
## CLI Quick Reference

### Lifecycle

```shell
# Start (detached), build if needed
docker compose up -d --build

# Stop containers, preserve volumes
docker compose down

# Stop AND remove volumes (destructive!)
docker compose down -v

# Restart a single service without a full down/up
docker compose restart app

# Remove stopped containers
docker compose rm -f
```
### Inspection

```shell
# Running services and their ports
docker compose ps

# Follow logs (all services)
docker compose logs -f

# Follow logs for a specific service
docker compose logs -f --tail=100 app

# Inspect the resolved compose config (after variable substitution)
docker compose config

# Inspect service environment
docker compose exec app env
```
### Running Commands

```shell
# Exec in a running container
docker compose exec app sh

# Run a one-off command in a new container (inherits service config)
docker compose run --rm app npm run migrate

# Override the entrypoint
docker compose run --rm --entrypoint sh app
```
### Build & Push

```shell
# Build all services (Compose v2 builds in parallel by default)
docker compose build --no-cache

# Build a specific service
docker compose build app

# Push all images
docker compose push

# Pull latest images
docker compose pull
```
### Advanced

```shell
# Dry-run: see what would change
docker compose up --dry-run

# Wait for services to be healthy before returning
docker compose up -d --wait

# Scale a specific service
docker compose up -d --scale worker=8

# Show resource usage
docker compose top
docker stats $(docker compose ps -q)

# Copy files to/from a container
docker compose cp app:/app/logs ./local-logs
```
## Production Patterns

### Pattern 1: Reverse Proxy with Traefik

```yaml
services:
  traefik:
    image: traefik:v3
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.letsencrypt.acme.email=ops@example.com"
      - "--certificatesresolvers.letsencrypt.acme.storage=/acme/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - traefik-acme:/acme

  api:
    image: myapp/api:${TAG}
    labels:
      traefik.enable: "true"
      traefik.http.routers.api.rule: "Host(`api.example.com`)"
      traefik.http.routers.api.entrypoints: "websecure"
      traefik.http.routers.api.tls.certresolver: "letsencrypt"
      traefik.http.services.api.loadbalancer.server.port: "3000"

volumes:
  traefik-acme:
```
### Pattern 2: Database with Init Scripts & Backups

```yaml
services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
      POSTGRES_DB: ${DB_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./db/init:/docker-entrypoint-initdb.d:ro   # Init scripts run on first start
    secrets:
      - db_password
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
      interval: 10s
      retries: 5
      start_period: 20s

  pgbackup:
    image: prodrigestivill/postgres-backup-local
    environment:
      POSTGRES_HOST: postgres
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
      POSTGRES_DB: ${DB_NAME}
      SCHEDULE: "@daily"
      BACKUP_KEEP_DAYS: 7
    volumes:
      - ./backups:/backups
    secrets:
      - db_password
    depends_on:
      postgres:
        condition: service_healthy
```
### Pattern 3: Zero-Downtime Redeploy Script

```bash
#!/usr/bin/env bash
# deploy.sh
set -euo pipefail

TAG=${1:?Usage: deploy.sh <tag>}
export TAG

echo "Pulling new image..."
docker compose pull app

echo "Deploying with zero-downtime scale..."
docker compose up -d --scale app=2 --no-recreate

echo "Waiting for new instance to be healthy..."
sleep 15   # Crude; with a healthcheck defined, prefer: docker compose up -d --wait

echo "Removing old instance..."
docker compose up -d --scale app=1 --no-recreate

echo "Deploy complete. Running containers:"
docker compose ps
```
### Pattern 4: Init Container Pattern

```yaml
services:
  # Init container — runs first, must exit 0
  migrate:
    image: myapp:${TAG}
    command: ["npm", "run", "db:migrate"]
    environment:
      DATABASE_URL: ${DATABASE_URL}
    depends_on:
      db:
        condition: service_healthy
    restart: "no"   # Don't restart if migrations fail — fail loudly

  app:
    image: myapp:${TAG}
    depends_on:
      migrate:
        condition: service_completed_successfully
      db:
        condition: service_healthy
```
### Pattern 5: Observability Stack

```yaml
services:
  app:
    image: myapp
    environment:
      OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4317

  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    volumes:
      - ./otel-config.yaml:/etc/otel/config.yaml
    command: ["--config=/etc/otel/config.yaml"]

  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    ports:
      - "127.0.0.1:9090:9090"

  grafana:
    image: grafana/grafana:latest
    environment:
      GF_SECURITY_ADMIN_PASSWORD__FILE: /run/secrets/grafana_pass
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning:ro
    ports:
      - "127.0.0.1:3001:3000"
    secrets:
      - grafana_pass
```
## Common Gotchas

### 1. Port binding 0.0.0.0 vs 127.0.0.1

```yaml
# ❌ Exposes the port on all interfaces — dangerous on cloud VMs
ports:
  - "5432:5432"

# ✅ Only accessible locally
ports:
  - "127.0.0.1:5432:5432"
```
### 2. Volume mount shadows container content

```yaml
# If ./app doesn't have node_modules, this silently removes them
volumes:
  - ./app:/app   # Bind mount hides the container's /app/node_modules
```

Fix: mount an anonymous volume over `node_modules` so the bind mount can't shadow it.
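The fix in full — the more specific mount wins, so the anonymous volume keeps the image's installed dependencies visible (paths assume the app lives in `/app`):

```yaml
services:
  app:
    volumes:
      - ./app:/app          # Source code from the host
      - /app/node_modules   # Anonymous volume — shields installed deps
```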
### 3. .env is for Compose variable substitution, not container env

`.env` values are interpolated into `compose.yaml`. They are **not** automatically injected as container environment variables unless you reference them in `environment:` (or list the file under `env_file:`).

```yaml
# .env: SECRET_KEY=abc123
services:
  app:
    # ❌ Container does NOT get SECRET_KEY automatically
    image: myapp
    # ✅ Explicit pass-through
    environment:
      SECRET_KEY: ${SECRET_KEY}
```
### 4. restart: always vs unless-stopped

| Policy | Restarts on | Survives `docker stop` | Survives daemon restart |
|---|---|---|---|
| `always` | Any exit | ✅ | ✅ |
| `unless-stopped` | Any exit | ❌ | ✅ |
| `on-failure` | Non-zero exit | ❌ | Only if it was running |

Use `unless-stopped` for most production services.
### 5. command overrides, entrypoint replaces

```yaml
# Image defines: ENTRYPOINT ["/bin/sh"]  CMD ["default.sh"]
command: custom.sh             # → /bin/sh custom.sh (CMD replaced)
entrypoint: ["/usr/bin/env"]   # → /usr/bin/env (ENTRYPOINT replaced, CMD ignored)
```
### 6. Networking between docker compose projects

Services in different Compose projects cannot resolve each other by default. Fix: attach both projects to a shared external network.
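A minimal sketch — the network and service names are illustrative. Create the network once, then each project's compose file joins it:

```yaml
# First: docker network create shared-net
# project-a/compose.yaml (project-b mirrors this)
services:
  api:
    image: myapp/api
    networks:
      - shared-net

networks:
  shared-net:
    external: true
```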
### 7. depends_on does not wait for application readiness

The `service_healthy` condition waits for the Docker health check — not for your app to actually be serving traffic. Always implement retry/backoff in application startup code.
### 8. Build context size matters

Docker sends the entire build context to the daemon. A missing or overly permissive `.dockerignore` makes builds slow and images large.

```
# .dockerignore — always include this
.git
node_modules
dist
*.log
.env*
.DS_Store
coverage
__pycache__
```
## Quick Reference Card

```shell
# The 6 commands you'll use 90% of the time
docker compose up -d --build        # Build + start everything
docker compose down -v              # Teardown including volumes
docker compose logs -f app          # Follow service logs
docker compose exec app sh          # Shell into a running container
docker compose run --rm app <cmd>   # One-off command
docker compose ps                   # Service status
```