Essential Docker Commands: The Complete Cheat Sheet
Key Takeaways
- `docker system prune -a --volumes` prevents 2 AM disk exhaustion outages: 47GB of dangling layers accumulated silently until checkout went down
- Tag every image with explicit versions (`nginx:1.25-alpine`), never `:latest`: latest is a moving target that breaks rollbacks when old images are garbage-collected
- Multi-stage Dockerfile builds can reduce final image size by 80–90%: compile in a full builder image, copy only the binary into `FROM scratch`
- Named volumes persist data across container restarts and removal; bind mounts map host paths for dev; tmpfs keeps secrets in memory only. Each has different persistence and safety properties
- User-defined networks resolve container names to IPs through Docker's embedded DNS: with `docker run --network myapp-network --name api`, `ping db` works from inside `api` when `db` is on the same network
It was 2 AM on a Saturday. The checkout service returned 500s.
`kubectl logs` showed `no space left on device`. The culprit? Overlay2 storage had accumulated 47GB of dangling layers from months of CI/CD builds. Nobody had run `docker system prune`. The fix took 30 seconds. Finding it took 90 minutes.
Master container operations, image management, volumes, networking, and the disk cleanup command that prevents 2 AM outages like this one. All 50 essential Docker commands, organized by task.
- Use tables and quick command blocks for instant lookup and copy-paste
- Always tag images with versions, never `:latest` in production
- `docker system prune -a --volumes` prevents disk exhaustion before it kills production
Pick the Right Command for the Symptom
Most "I forgot the docker command for X" lookups boil down to a small decision tree. Route by symptom, not by concept:
graph TD
Start[What do I need?] --> Type{Container,<br/>image, volume,<br/>or network?}
Type -->|Container| Cstate{Running or<br/>stopped?}
Cstate -->|Running| Crun[ps · exec · logs · stats]
Cstate -->|Stopped| Cstop[ps -a · start · rm]
Type -->|Image| Iaction{Build, list,<br/>or clean?}
Iaction -->|Build| Ibuild[build · tag · push]
Iaction -->|List| Ilist[images · history · inspect]
Iaction -->|Clean| Iclean[image prune · system prune]
Type -->|Volume| Vol[volume ls · volume inspect · volume rm]
Type -->|Network| Net[network ls · network inspect · network connect]
style Crun fill:#dfd
style Ibuild fill:#dfd
style Vol fill:#dfd
style Net fill:#dfd
The diagram is the cheat-sheet within the cheat-sheet — every command in the rest of this file lives at a leaf of that tree.
The Quick Start
| Task | Command |
|---|---|
| Run | docker run -d --name myapp -p 8080:8080 nginx:1.25 |
| List running | docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" |
| List all | docker ps -a |
| Stop | docker stop myapp |
| Start | docker start myapp |
| Restart | docker restart myapp |
| Shell into | docker exec -it myapp /bin/bash |
| Run one command | docker exec myapp curl http://localhost:8080/health |
| Copy from container | docker cp myapp:/var/log/app.log ./debug/ |
| Remove | docker rm myapp |
Images and Builds
## Build from Dockerfile
docker build -t myapp:v1.0 .
## Build with build args
docker build --build-arg NODE_ENV=production -t myapp:prod .
## Pull specific version (always use tags, never :latest)
docker pull postgres:16-alpine
## Tag and push to registry
docker tag myapp:v1.0 registry.company.com/myapp:v1.0
docker push registry.company.com/myapp:v1.0
## Check image disk usage
docker system df
## Remove dangling images
docker image prune -f
## Remove all unused images
docker image prune -a -f
Registry login: `docker login` (Docker Hub), `aws ecr get-login-password | docker login --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com` (ECR), or `gcloud auth configure-docker` (GCP).
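The "always use tags, never :latest" rule can be checked mechanically before a pull or deploy. The following is a minimal sketch, assuming a hypothetical helper named `is_pinned` (not a Docker command) that only inspects the reference string and never contacts a registry:

```shell
# is_pinned: hypothetical helper. Succeeds when an image reference carries
# an explicit non-latest tag or a digest. Pure string check, no registry call.
is_pinned() {
  ref=$1
  case $ref in
    *@sha256:*) return 0 ;;   # digest references are immutable
  esac
  name=${ref##*/}             # keep the last path segment so a registry port (host:5000/app) is not read as a tag
  case $name in
    *:latest) return 1 ;;     # explicit :latest
    *:?*)     return 0 ;;     # any other explicit tag
    *)        return 1 ;;     # no tag: Docker defaults to :latest
  esac
}

# Example guard before pulling:
# is_pinned "postgres:16-alpine" && docker pull "postgres:16-alpine"
```

A check like this fits in a CI step or a pre-deploy hook; it says nothing about whether the tag actually exists in the registry.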
Dockerfile Basics
## Multi-stage build: fast rebuilds, tiny final image
FROM golang:1.23-bookworm AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /server ./cmd/server
FROM scratch
COPY --from=builder /server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
Layer caching rule: put rarely-changing instructions (`RUN apt-get install`) first and frequently-changing ones (`COPY . .`) last. Docker reuses a cached layer only if that instruction and everything before it are unchanged.
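The layer caching rule lends itself to a quick lint. The sketch below assumes a hypothetical helper, `copy_before_deps`, that flags a Dockerfile where `COPY . .` appears before a dependency-install step. The list of install commands is an assumption and the check is a heuristic: it does not understand multi-stage boundaries.

```shell
# copy_before_deps: hypothetical lint helper. Succeeds (exit 0) when a
# Dockerfile copies the whole build context before a dependency-install step,
# which invalidates the dependency layer on every code change.
copy_before_deps() {
  awk '
    /^[ \t]*COPY[ \t]+\.[ \t]+\.[ \t]*$/ { if (!copy) copy = NR }
    /^[ \t]*RUN[ \t].*(apt-get install|npm ci|npm install|go mod download|pip install)/ { deps = NR }
    END { exit ((copy && deps && copy < deps) ? 0 : 1) }
  ' "$1"
}

# Example:
# if copy_before_deps Dockerfile; then echo "move COPY . . below the install step" >&2; fi
```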
The image lifecycle in one picture — build → push → pull → run → exit:
graph LR
DF[Dockerfile + source] -->|docker build| Img[Image<br/>layered SHA256<br/>local cache]
Img -->|docker push| Reg[(Registry<br/>ECR / GCR / Hub)]
Reg -->|docker pull| Node[Node-local cache<br/>per-host image store]
Node -->|docker run<br/>or kubelet| Cont[Container<br/>writable top layer<br/>+ ephemeral fs]
Cont -->|docker exec| Shell[Interactive shell<br/>or one-shot command]
Cont -->|exit / SIGKILL| Stopped[Stopped container<br/>filesystem preserved]
Stopped -->|docker rm| Gone[Gone — top layer deleted]
Stopped -->|docker start| Cont
Cont -.->|docker commit<br/>rare, manual| Img
Cont -.->|volume mount| Vol[(Named volume<br/>survives container)]
style Cont fill:#dfd
style Reg fill:#ffd
style Vol fill:#ffd
The diagram captures every command's place in the lifecycle: build produces an Image, run instantiates a Container, exec enters a running container, rm deletes the writable layer (the volume survives).
## Inspect layer history
docker image history myapp:v1.0
## Get full metadata
docker image inspect myapp:v1.0
Volumes and Storage
| Type | Syntax | Use Case |
|---|---|---|
| Named | docker run -v mydata:/data myapp | Database storage, persistent app data |
| Bind mount | docker run -v /host/path:/container/path myapp | Source code (dev), config files |
| Read-only | docker run -v myapp-config:/etc/myapp:ro myapp | Immutable config, certificates |
| tmpfs | docker run --tmpfs /tmp myapp | Secrets in memory, scratch space |
## Create named volume
docker volume create myapp-data
## List and inspect
docker volume ls
docker volume inspect myapp-data
## Clean up unused volumes (Docker 23.0+ removes only anonymous ones unless you add --all)
docker volume prune -f
## Backup a volume
docker run --rm -v myapp-data:/source:ro -v $(pwd):/backup alpine \
  tar czf /backup/backup.tar.gz -C /source .
Networking
| Driver | Use Case | DNS Resolution | Multi-Host |
|---|---|---|---|
| bridge (default) | Single-host container-to-container | Yes (user-defined networks) | No |
| host | Maximum network performance | N/A | N/A |
| overlay | Swarm / multi-host networking | Yes | Yes |
| none | Complete network isolation | No | No |
## Create custom bridge network
docker network create myapp-network
## Containers on same network resolve by name
docker run -d --name api --network myapp-network myapp:v1.0
docker run -d --name db --network myapp-network -e POSTGRES_PASSWORD=change-me postgres:16
## Inside api: ping db → resolves to postgres IP
## Port mapping variants
docker run -p 8080:80 nginx # host:container
docker run -p 127.0.0.1:8080:80 nginx # localhost only
docker run -p 8080:80/udp nginx # UDP port
Debugging and Logs
## View logs with timestamps
docker logs --details --timestamps myapp
## Follow logs (like tail -f)
docker logs -f --tail=100 myapp
## Inspect container state and config
docker inspect myapp | jq '.[0] | {State, Env: .Config.Env, Memory: .HostConfig.Memory}'
## Shell into container
docker exec -it myapp /bin/bash
## Run command without interactive shell
docker exec myapp curl http://localhost:8080/health
## Monitor CPU/memory usage
docker stats --no-stream
## Watch events in real time
docker events --filter type=container
Production Cleanup
## Check disk usage (images, containers, volumes, build cache)
docker system df -v
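## The lifesaver: remove stopped containers, unused networks, unused images, and build cache older than 24h
docker system prune -a -f --filter "until=24h"
## Volumes need a separate step: Docker rejects the until filter when it is combined with --volumes
docker volume prune -f
## Run as a daily cron job on build servers to prevent disk exhaustion
When to use what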
| Use Case | Recommended | Why | Avoid When |
|---|---|---|---|
| Multi-stage build | FROM golang AS builder ... FROM scratch | Fast rebuilds, tiny final images (distroless), no runtime bloat | Single-stage Dockerfiles with all deps in final image |
| Base image for Go/Rust | distroless or alpine:3.19 | Minimal CVE surface; Go/Rust are compiled so no runtime needed | Node.js/Python — need runtime installed |
| Base image for Node/Python | node:22-alpine or python:3.12-slim | Small, has runtime, fewer CVEs than ubuntu | Using latest or bookworm when alpine works |
| Copy strategy | COPY for files, ADD only for .tar.gz | COPY is explicit; ADD auto-extracts but hides side effects | Mixing COPY and ADD in same layer for clarity |
| Container startup | ENTRYPOINT ["app"] + CMD ["arg1"] | ENTRYPOINT defines the executable; CMD supplies default arguments that docker run arguments replace | Using CMD for the binary and args together |
| Port mapping | docker run -p 8080:80 | Accessible externally (0.0.0.0); standard in production | docker run -p 127.0.0.1:8080:80 unless localhost-only intended |
| Persistent data | Named volumes docker run -v mydata:/data | Survives container deletion; easy backup/restore | Bind mounts for app data (lose data on host deletion) |
| Config/secrets | ConfigMaps in Kubernetes; env vars for Docker | Environment decouples config from image | Hardcoding in Dockerfile or .env files |
| Layer caching | Put stable deps first, code last | Reuses layers; fast iterative rebuilds | Putting COPY . . before RUN apt-get |
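The ENTRYPOINT + CMD row is easier to remember with a concrete trace. The sketch below is a simulation in plain shell, not Docker itself: `effective_command` is a hypothetical function that mimics how the exec-form ENTRYPOINT, the default CMD, and any arguments passed to `docker run IMAGE` combine.

```shell
# effective_command: hypothetical simulation (not a Docker command).
#   $1 = ENTRYPOINT, $2 = CMD default args, remaining = args given after the image name
effective_command() {
  entrypoint=$1
  default_args=$2
  shift 2
  if [ "$#" -gt 0 ]; then
    echo "$entrypoint $*"             # run-time args replace CMD entirely
  else
    echo "$entrypoint $default_args"  # no run-time args: CMD supplies the defaults
  fi
}

effective_command app arg1            # prints: app arg1
effective_command app arg1 arg2 -v    # prints: app arg2 -v
```

The executable stays fixed in both cases, which is the reason to put the binary in ENTRYPOINT rather than CMD.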
Gotchas that bite in production
- `docker system prune -a` removes ALL unused images, including base layers you'll rebuild
  - You run it Friday evening: `docker system prune -a --volumes`. Monday morning, every build is 5 minutes slower because it's rebuilding base layers from scratch.
  - Fix: Use `docker image prune` (no `-a`) to remove dangling images only. Run `docker system prune -a --filter "until=72h"` on build servers during off-peak to preserve recent layers.
- `docker run -d app` with no healthcheck dies silently, deployments continue
  - Container crashes immediately (OOM, permission denied, missing binary). Docker marks it as exited. Nothing brings it back because no restart policy is set, and nothing flags it because no health check is defined. Your service serves 503s for hours.
  - Fix: Always set `--health-cmd` or define `HEALTHCHECK` in the Dockerfile. Pair with `--restart=on-failure:3` (the `restart:` key in a compose file) to catch startup failures immediately.
- `COPY . .` before `RUN npm install` invalidates layer cache on every code change
  - You edit one line of code. Build restarts from "RUN npm install" (cache miss) and rebuilds 500MB of node_modules. Takes 3 minutes instead of 5 seconds.
  - Fix: Separate dependency layer from code layer:
    COPY package.json package-lock.json ./
    RUN npm ci
    COPY . .
    RUN npm run build
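- `docker exec` on a stopped container fails instead of running your command
  - You're debugging a crashed container and run a command to check its logs, but the container has already exited, so `docker exec` returns an "is not running" error. In a script that ignores exit codes that is easy to miss, and you start troubleshooting the wrong thing.
  - Fix: Always check container state first: `docker inspect --format '{{.State.Status}}' myapp`. Use `docker logs myapp` for crash output; it works on stopped containers (`--previous` is a `kubectl logs` flag, not a Docker one).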
- Image pulls time out in production during peak traffic
  - A new deployment triggers and pulls a 2GB image. The registry is slow and the pull times out after 2 minutes. The replica stays pending, load moves to the remaining pods, and they get OOMKilled. Cascading failure.
  - Fix: Always `docker pull` the image to all nodes before deployment (warming the cache). Use multi-registry (Quay + ECR fallback) for redundancy. Set `pull_policy: always` in your compose file only where you need fresh images on every deploy, since it forces a registry round trip each time.
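The age-filtered prune from the first gotcha can be wrapped in a small script that only runs when the disk is actually filling up. This is a sketch under stated assumptions: the `/var/lib/docker` data root, the 80% threshold, the function names, and the crontab path are all illustrative choices, not values from Docker.

```shell
# Hypothetical cron script: prune only when the Docker data disk is filling up.
# DOCKER_ROOT and THRESHOLD are assumptions; adjust them for your hosts.
DOCKER_ROOT=${DOCKER_ROOT:-/var/lib/docker}
THRESHOLD=${THRESHOLD:-80}

# disk_pct: print the used-space percentage (without the % sign)
# for the filesystem that holds the path in $1.
disk_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# prune_if_needed: age-filtered prune so recently used layers stay cached.
prune_if_needed() {
  used=$(disk_pct "$DOCKER_ROOT")
  if [ "$used" -ge "$THRESHOLD" ]; then
    docker system prune -a -f --filter "until=72h"
  else
    echo "disk at ${used}%, below ${THRESHOLD}%: nothing to do"
  fi
}

# Only act when called as: docker-cleanup.sh run
if [ "${1:-}" = "run" ]; then
  prune_if_needed
fi

# Example crontab entry (path is an assumption):
# 0 3 * * * /usr/local/bin/docker-cleanup.sh run
```

Gating on disk usage keeps base layers warm on hosts that are nowhere near full, which addresses the slow Monday builds described above.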
Common Gotchas
- Always specify image versions: never `:latest` in production. `docker pull postgres:latest` gets a different image next month.
- Build args are not secrets: `ARG GITHUB_TOKEN=...` is visible in `docker image history`. Use BuildKit secret mounts instead: `RUN --mount=type=secret,id=token npm ci`.
- `--privileged` is root access: never use in production. It gives the container full access to the host's devices. Debug locally only.
- `docker rm` doesn't remove images: use `docker rmi myapp:v1.0`. Removing images doesn't clean the build cache; `docker builder prune` does.
- Dangling layers kill disk space: `docker system prune -a --volumes` removes stopped containers, unused networks, all unused images, build cache, and unused volumes (anonymous ones only on recent Docker releases). Set up as a daily cron job on build servers.
- Port mapping to localhost is not reachable from other hosts: `docker run -p 127.0.0.1:8080:80 nginx` listens on localhost only. `docker run -p 8080:80 nginx` listens on 0.0.0.0 (externally accessible).
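The localhost-versus-0.0.0.0 distinction in the last gotcha can be checked from the `-p` value alone. A minimal sketch, assuming a hypothetical helper `publish_scope` and a daemon whose default bind address has not been changed:

```shell
# publish_scope: hypothetical helper that classifies a `docker run -p` value
# by the host address it binds. Pure string check; it does not query Docker.
publish_scope() {
  spec=${1%/*}                           # drop a trailing /tcp or /udp
  case $spec in
    127.*:*:* | "[::1]":*:*)  echo loopback ;;
    0.0.0.0:*:* | "[::]":*:*) echo all-interfaces ;;
    *:*:*)                    echo specific-address ;;
    *)                        echo all-interfaces ;;   # hostPort:containerPort or containerPort only
  esac
}

publish_scope 127.0.0.1:8080:80   # prints: loopback
publish_scope 8080:80/udp         # prints: all-interfaces
```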
Docker image security: the SBOM workflow
Audits and supply-chain attacks have made image provenance non-negotiable. The standard pipeline is build → SBOM → scan → sign → verify, and each step has a one-line tool. An SBOM (Software Bill of Materials) inventories every package, version, and license inside the image; a vulnerability scanner cross-references it against CVE databases; a signature proves the image came from your build pipeline and has not been swapped at the registry.
Generate an SBOM with Syft, scan it with Grype, then sign with cosign and a keyless OIDC identity provided by Sigstore. The same workflow runs in CI (block the push on critical CVEs) and at admission time in Kubernetes (reject unsigned images via Kyverno or Connaisseur).
## 1. Generate SBOM in CycloneDX format (also supports SPDX, syft-json)
syft myapp:v1.0 -o cyclonedx-json=sbom.json
## 2. Scan the SBOM, fail the build on Critical / High CVEs
grype sbom:sbom.json --fail-on high --only-fixed
## 3. Sign the image keyless via Sigstore (uses GitHub OIDC token in CI)
cosign sign --yes registry.company.com/myapp:v1.0
## 4. Attach the SBOM as an attestation to the image
cosign attest --yes --predicate sbom.json \
--type cyclonedx registry.company.com/myapp:v1.0
## 5. At deploy time, verify the signature (check the SBOM attestation separately with cosign verify-attestation --type cyclonedx and the same identity flags)
cosign verify \
  --certificate-identity-regexp "https://github.com/company/.*" \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
  registry.company.com/myapp:v1.0
The keyless flow eliminates the long-lived signing key that previously had to live in a vault. The OIDC token from GitHub Actions (or any compliant CI) is exchanged for a short-lived certificate at Sigstore's Fulcio, the signature is recorded in Rekor's transparency log, and verifiers check both. If an attacker pushes a malicious image without the matching OIDC identity, `cosign verify` fails and the deployment is rejected.
Pin the base image by digest, not tag: `FROM gcr.io/distroless/base@sha256:abcd...` instead of `FROM gcr.io/distroless/base:latest`. Tags are mutable; digests are not. Combined with cosign verification, this closes off most registry-poisoning attacks, because a swapped tag no longer changes what you build from.
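Switching a Dockerfile from tag to digest is mostly mechanical. The sketch below assumes two hypothetical helpers: `image_digest`, which reads the repo digest of an already-pulled image through `docker inspect`, and `pin_from_line`, which rewrites a single `FROM` line. It assumes the image reference is the second word of the line, so a `FROM --platform=...` line is out of scope.

```shell
# image_digest: print the repo digest (name@sha256:...) of a locally pulled image.
# The image must have been pulled from or pushed to a registry to have one.
image_digest() {
  docker inspect --format '{{index .RepoDigests 0}}' "$1"
}

# pin_from_line: hypothetical helper that swaps the image reference in a
# Dockerfile FROM line for a digest reference. An "AS stage" suffix is kept.
#   $1 = original FROM line, $2 = name@sha256:... reference
pin_from_line() {
  printf '%s\n' "$1" | awk -v ref="$2" '{ $2 = ref; print }'
}

# Example:
# pin_from_line "FROM nginx:1.25-alpine AS web" "$(image_digest nginx:1.25-alpine)"
```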
Production Dockerfile patterns
Three patterns separate hobbyist Dockerfiles from production ones: distroless final stages, BuildKit cache mounts, and --mount=type=secret for build-time credentials. Each cuts image size, build time, or attack surface — and they compose.
# syntax=docker/dockerfile:1.7
## Rust example: cache mount keeps cargo registry warm across builds
FROM rust:1.83-bookworm AS builder
WORKDIR /app
COPY Cargo.toml Cargo.lock ./
COPY src ./src
RUN --mount=type=cache,target=/usr/local/cargo/registry \
--mount=type=cache,target=/app/target \
cargo build --release --locked && \
cp target/release/server /server
## Distroless final stage: no shell, no package manager, minimal CVE surface
FROM gcr.io/distroless/cc-debian12:nonroot
COPY --from=builder /server /server
USER nonroot:nonroot
EXPOSE 8080
ENTRYPOINT ["/server"]
The cache mount survives across builds without bloating the image (it lives in BuildKit's cache, not in any layer). For a Rust monorepo, this can turn five-minute compiles into thirty-second incremental builds.
For Java services, the analogous pattern uses Maven's repository cache plus a JLink-trimmed JRE in the runtime stage:
# syntax=docker/dockerfile:1.7
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /app
COPY pom.xml ./
RUN --mount=type=cache,target=/root/.m2 mvn -B dependency:go-offline
COPY src ./src
RUN --mount=type=cache,target=/root/.m2 \
mvn -B -DskipTests package && \
jlink --add-modules java.base,java.logging,java.sql,java.naming \
--strip-debug --no-header-files --no-man-pages \
--output /jre
FROM gcr.io/distroless/java-base-debian12:nonroot
COPY --from=builder /jre /jre
COPY --from=builder /app/target/app.jar /app.jar
USER nonroot:nonroot
ENTRYPOINT ["/jre/bin/java", "-jar", "/app.jar"]
Build-time secrets: never `ARG GITHUB_TOKEN`, because it lands in image history. Use `--mount=type=secret`:
DOCKER_BUILDKIT=1 docker build \
--secret id=npmrc,src=$HOME/.npmrc \
  -t myapp:v1.0 .
Inside the Dockerfile: `RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci`. The secret is mounted only for that RUN step and never written to a layer.
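Whether a credential leaked into image history can be verified after the build. A minimal sketch, assuming a hypothetical helper `history_leaks` that greps the build commands recorded by `docker history` for a fixed string such as a variable name:

```shell
# history_leaks: succeeds (exit 0) when the given string shows up in the
# build commands recorded for an image. The pattern is matched literally.
#   $1 = image reference, $2 = string to look for
history_leaks() {
  docker history --no-trunc --format '{{.CreatedBy}}' "$1" | grep -qF -- "$2"
}

# Example: fail a CI step if a token name appears in the history.
# if history_leaks myapp:v1.0 GITHUB_TOKEN; then echo "secret in history" >&2; exit 1; fi
```

A clean result only shows that the string is absent from the recorded commands; it does not prove that no file inside a layer contains the secret.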
Container debugging in production
When a container misbehaves in production, the diagnostic flow is roughly: check liveness, then resource pressure, then in-process state, then kernel-level signals. Each step has one or two commands and rules out a class of failure before you escalate.
## 1. Is the container actually alive? Did it OOM-kill recently?
docker inspect --format \
'{{.State.Status}} exit={{.State.ExitCode}} oom={{.State.OOMKilled}} pid={{.State.Pid}}' \
myapp
## 2. Resource pressure — CPU throttling and memory headroom
docker stats --no-stream --format \
"table {{.Name}}\t{{.CPUPerc}}\t{{.MemPerc}}\t{{.MemUsage}}\t{{.NetIO}}"
## 3. What is the container actually doing? PID + threads inside namespace
docker top myapp -o pid,ppid,pcpu,pmem,etime,stat,wchan,cmd
## 4. Real-time event stream — restarts, OOMs, health-check transitions
docker events --filter container=myapp \
--filter event=die --filter event=oom --filter event=health_status
## 5. When exec fails because the container has no shell (distroless)
docker run -it --rm --pid=container:myapp --net=container:myapp \
  --cap-add=SYS_PTRACE nicolaka/netshoot
The last command is the workhorse for distroless images: it joins the target container's PID and network namespaces using a debug image that has tcpdump, strace, dig, curl, nc, and lsof. From inside it, `strace -p 1` attaches to the application's PID 1 to see what syscalls it is blocked on, and `ss -tnlp` lists every listening socket without modifying the production image.
One concrete diagnostic flow for "service returns 503 intermittently": run docker stats to rule out CPU throttling, run docker top to confirm the worker count matches expectations, run docker exec myapp curl -s localhost:8080/metrics | grep http_requests to read in-process counters, and finally docker events --since 10m to spot whether the orchestrator killed and restarted a replica during the window. Most production debugging never needs to leave these four commands — the discipline is running them in order rather than guessing.
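The first step of that flow, checking liveness, is easy to script so the answer is a single word. The sketch below assumes two hypothetical helpers: `classify_state`, which interprets the fields, and `triage`, which reads them with the same `docker inspect` format fields used in step 1.

```shell
# classify_state: hypothetical helper that turns three inspect fields into a
# short diagnosis.
#   $1 = .State.Status, $2 = .State.ExitCode, $3 = .State.OOMKilled
classify_state() {
  if [ "$3" = "true" ]; then
    echo oom-killed
  elif [ "$1" != "exited" ] && [ "$1" != "dead" ]; then
    echo "$1"                       # running, paused, restarting, created
  elif [ "$2" = "0" ]; then
    echo exited-cleanly
  else
    echo "crashed (exit $2)"
  fi
}

# triage: read the three fields for a container and classify them.
triage() {
  set -- $(docker inspect --format '{{.State.Status}} {{.State.ExitCode}} {{.State.OOMKilled}}' "$1")
  [ "$#" -eq 3 ] || { echo unknown; return 1; }
  classify_state "$1" "$2" "$3"
}

# Example: triage myapp
```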
Frequently Asked Questions
How do I run a container with environment variables?
Use -e KEY=value (or -e KEY to read from the host environment). Example: docker run -e DATABASE_URL=postgres://db:5432/myapp -e DEBUG=true myapp:v1.0.
How do I mount a volume at a specific path?
docker run -v /host/path:/container/path myapp for bind mounts. Use -v mydata:/data for named volumes. Append :ro for read-only: docker run -v config:/etc/app:ro myapp.
How do I see if a container is using too much memory?
Run `docker stats --no-stream` for a one-shot snapshot of CPU/memory across all containers (drop `--no-stream` for a live view). To inspect a single container's limit: `docker inspect --format '{{.HostConfig.Memory}}' myapp` (returns bytes; 0 means unlimited).
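The raw byte count is awkward to read at a glance. A small sketch, assuming a hypothetical helper `mem_limit_human` that formats the value and treats 0 as unlimited:

```shell
# mem_limit_human: hypothetical helper that formats the byte count returned by
# docker inspect --format '{{.HostConfig.Memory}}'.
mem_limit_human() {
  if [ "$1" -eq 0 ]; then
    echo unlimited
  else
    echo "$(( $1 / 1024 / 1024 ))MiB"
  fi
}

# Example:
# mem_limit_human "$(docker inspect --format '{{.HostConfig.Memory}}' myapp)"
```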
What's the difference between docker stop and docker kill?
docker stop sends SIGTERM for graceful shutdown, waits 10 seconds, then escalates to SIGKILL. docker kill sends SIGKILL immediately. Always use stop in production so in-flight requests can drain.
How do I copy a file out of a crashed container?
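`docker cp myapp:/var/log/app.log ./debug/` works on stopped containers. If you need to explore the filesystem, create a debug image: `docker commit myapp debug-myapp && docker run -it --entrypoint sh debug-myapp`. Overriding the entrypoint keeps the crashed application from starting again in place of the shell.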
Can I run commands inside a container without a shell?
Yes. docker exec myapp curl http://localhost:8080/health runs curl directly without spawning a shell. Reserve docker exec -it for cases where you actually need interactive input.
Production Checklist
- Images tagged with specific versions, never `:latest`
- `docker system prune -a -f --filter "until=24h"` runs daily on build servers
- Containers run as non-root USER in Dockerfile
- Health checks configured in compose files or run commands
- Memory limits set: `docker run --memory="512m" myapp`
- Volumes for persistent data (database, logs, caches)
- Private registry configured for sensitive images
- `.dockerignore` excludes .git, node_modules, *.md, .env files
- Multi-stage Dockerfile with build stage and minimal final stage
- Images scanned for CVEs before pushing: `docker scout cves myapp:v1.0 --exit-code`
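Two of these items, non-root user and memory limit, can be audited against a running container. This is a sketch, assuming a hypothetical helper `check_container`; it reads `.Config.User` and `.HostConfig.Memory` via `docker inspect` and treats an empty user as root, which is the default when neither the image nor `docker run --user` sets one.

```shell
# check_container: hypothetical audit helper for two checklist items.
# Prints one line per check and returns non-zero if any check fails.
check_container() {
  rc=0
  user=$(docker inspect --format '{{.Config.User}}' "$1")
  mem=$(docker inspect --format '{{.HostConfig.Memory}}' "$1")
  case $user in
    '' | root | 0 | root:* | 0:*) echo "FAIL $1: runs as root"; rc=1 ;;
    *) echo "ok   $1: user=$user" ;;
  esac
  if [ "$mem" = "0" ]; then
    echo "FAIL $1: no memory limit"; rc=1
  else
    echo "ok   $1: memory=$mem bytes"
  fi
  return "$rc"
}

# Example: check_container myapp || echo "checklist violations found" >&2
```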
Keep Reading
- Essential Kubernetes Commands Cheat Sheet: The `kubectl` commands for deploying, debugging, and managing the containers Docker builds
- Essential Linux Commands Cheat Sheet: Process management, networking, and disk commands you will use when debugging inside containers
- Graceful Shutdown in Production Go Services: Handling SIGTERM correctly so `docker stop` drains in-flight requests instead of dropping them
Engineering Team
A multidisciplinary team of backend engineers, architects, and DevOps practitioners shipping deep dives into distributed systems and production infrastructure.