Docker Compose Multi-Service Orchestration
What
A Docker Compose stack whose services gate their startup on explicit service_healthy dependencies, so Postgres comes up before the API, Redis comes up before the worker, and the worker comes up before the frontend. Bind mounts map the source tree into both web and worker, so host edits are visible in the containers without an image rebuild.
Project Context
In full_project_context_updated.txt -> ./docker-compose.yml, the stack is db (pgvector), redis (cache + Celery broker/backend), web (FastAPI), worker (Celery), frontend (Streamlit). Postgres uses the pgvector/pgvector:pg15 image, which ships with the pgvector extension pre-installed. Both web and worker mount .:/app, so a code edit on the host is visible in both containers without rebuilding. The worker disables its healthcheck (disable: true) because a Celery worker process has no HTTP port to probe.
How
Service definitions, in dependency order
services:
  db:
    image: pgvector/pgvector:pg15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U fca_user -d fca_support"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 10s
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
  web:
    depends_on:
      db: { condition: service_healthy }
      redis: { condition: service_healthy }
    volumes:
      - .:/app # sync python code into container
      - ./alembic:/app/alembic # sync migrations
      - ./tests:/app/tests
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000
  worker:
    depends_on:
      - db
      - redis
    volumes:
      - .:/app
    command: celery -A app.worker.celery_app worker --loglevel=info
    healthcheck:
      disable: true

- condition: service_healthy is required (not just service_started) so the app waits until pg_isready returns 0. Without it, the API can start before Postgres is ready and crash with connection refused. Note that the worker above uses the short list form of depends_on, which only waits for db and redis to start, not to become healthy.
- allkeys-lru is a sensible eviction policy for a cache, but this Redis instance is also the Celery broker and result backend. Under memory pressure allkeys-lru can evict any key, including queued task messages and stored results, so keep the 256mb limit comfortably above the working set or give the broker its own instance with noeviction.
- Mounting .:/app in both web and worker saves a rebuild on every code edit, but the process inside the container still has to pick up the change. uvicorn does so only when started with --reload, which the command above does not pass; Celery workers do not reload at all, so run docker compose restart worker after editing task or model code.
- disable: true on worker.healthcheck turns off any HEALTHCHECK inherited from the image. Docker defines no healthcheck by default, so this matters only when the shared image declares one (for example an HTTP probe such as curl http://localhost/), which the worker would fail because Celery exposes no HTTP endpoint.
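If the worker should also wait for healthy dependencies rather than merely started ones, the long form of depends_on can be used there too. The fragment below is a sketch of that single change, not the project's actual file; the other worker keys are assumed to stay as they are.

```yaml
services:
  worker:
    depends_on:
      # Long form: wait for each dependency's healthcheck to pass,
      # instead of only waiting for its container to start.
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
```

Both db and redis already define a healthcheck in this stack, so Compose has a signal to wait on for each of them.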
Common Pitfalls
depends_on without condition: service_healthy treats the dependency as met as soon as its container has started, not when the service is actually accepting connections. Always use condition: service_healthy for stateful services like Postgres and Redis.
PYTHONUNBUFFERED=1 not set in the compose environment: Python block-buffers stdout when it is not attached to a terminal, so output from the web and worker processes can show up late and in bursts, or be lost if the process is killed before the buffer is flushed. Always set it.
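A minimal sketch of the second fix, assuming the variable is set per service in the compose file rather than in the image's Dockerfile:

```yaml
services:
  web:
    environment:
      PYTHONUNBUFFERED: "1"   # write stdout/stderr through without buffering
  worker:
    environment:
      PYTHONUNBUFFERED: "1"   # same for the Celery worker process
```

The value is quoted so YAML passes it as the string "1" rather than as an integer.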
Real-World Interview Prep
Q1: When would you migrate from docker compose to Kubernetes?
A: Three signals. (1) You need rolling deploys / zero-downtime (docker compose up -d replaces a changed container by stopping the old one before starting the new one, so there is a gap in service). (2) You need auto-scaling on CPU/memory/custom metrics. (3) You have multiple services that need NetworkPolicies (e.g., the worker should not accept ingress). Stay on compose when you have one host, one stack, no scaling needs, and short-lived dev environments. The migration cost is real: Helm charts, RBAC, and secrets management all need to be set up. For the FCA stack on a single VM, compose is fine; on 5+ VMs with HPA, move to K8s.
Q2: What’s the difference between depends_on with and without condition: service_healthy?
A: Without a condition, depends_on only ensures the dependency's container has started. Postgres might still be initializing when the API tries to connect, so the app crashes with connection refused. condition: service_healthy waits until the dependency's healthcheck.test returns 0, i.e., Postgres is actually accepting connections. The healthcheck block on the dependency defines what “healthy” means; without it Compose has no signal to wait on. The gotcha: if an HTTP healthcheck floods the probed service's access logs, switch to a TCP-level probe (nc -z) instead of an HTTP GET.
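The TCP-level probe mentioned in the answer could look roughly like the fragment below. It is a sketch only: it assumes nc is installed in the web image (not confirmed by the source), and the interval, timeout, and retry values are illustrative.

```yaml
services:
  web:
    healthcheck:
      # Succeeds as soon as something accepts TCP connections on port 8000;
      # no HTTP request is sent, so nothing is written to the access log.
      test: ["CMD-SHELL", "nc -z 127.0.0.1 8000"]
      interval: 10s
      timeout: 3s
      retries: 5
```

The trade-off is that an open port only proves the process is listening, not that the application behind it is answering requests correctly.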
Q3: When should you volume-mount source code (- .:/app) vs build it into the image?
A: Mount source code when (a) you’re in local development and want live reload, (b) you need the dev container to share machine-local state (gitignored files, IDE artifacts). Build into the image when (a) you’re shipping to production — runtime containers must NOT depend on host filesystem layout, (b) you need reproducible builds (every image layer is immutable). Many teams transition via the “synced code in dev, distroless image in prod” pattern: docker-compose.yml for dev mounts the host tree, the CI pipeline builds a multi-stage image with COPY . baked in. Don’t ship mount-based code to prod — the host path is meaningless inside Kubernetes pods.
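One way to keep the dev-only mounts out of the production definition is a Compose override file. The sketch below assumes the file is named docker-compose.override.yml, which docker compose up merges on top of docker-compose.yml when no -f flag is given; adding --reload to the dev command is likewise an assumption, not something the base file shows.

```yaml
# docker-compose.override.yml (development only)
services:
  web:
    volumes:
      - .:/app   # bind-mount the host source tree
    # Same command as the base file, plus --reload for live reload in dev.
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
  worker:
    volumes:
      - .:/app
```

Running docker compose -f docker-compose.yml up -d names the base file explicitly, so the override is skipped and the containers run only the code baked into the image.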
Top-to-Bottom Code Walkthrough (docker-compose.yml)
This file is the single source of truth for the local stack. Every service is a container on a shared bridge network — service names (web, db, redis) become hostnames for inter-service communication.
Top-level structure
version: '3.8'
services:
  db: ...
  redis: ...
  web: ...
  worker: ...
  frontend: ...
volumes:
  postgres_data: ...
  redis_data: ...
  fca-logs: ...
networks:
  fca-network:
    driver: bridge

- version: '3.8' declares the Compose file format. Docker Compose v2 treats this key as obsolete and ignores it, so it can be dropped.
- services: holds one block per container.
- volumes: declares named volumes, which persist data outside containers.
- networks: defines a custom bridge so containers can talk to each other by name.
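To show how the three top-level blocks connect, here is a reduced sketch in which db uses a named volume and both db and web join the custom bridge. Whether each service in the real file lists networks: explicitly is an assumption; the excerpt elides those details.

```yaml
services:
  db:
    image: pgvector/pgvector:pg15
    volumes:
      - postgres_data:/var/lib/postgresql/data   # named volume declared below
    networks:
      - fca-network                              # join the custom bridge
  web:
    networks:
      - fca-network                              # can reach the database as host "db"

volumes:
  postgres_data:        # survives container removal; deleted by `docker compose down -v`

networks:
  fca-network:
    driver: bridge
```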
db — Postgres + pgvector
db:
  image: pgvector/pgvector:pg15
  container_name: fca-postgres
  restart: unless-stopped
  environment:
    POSTGRES_USER: fca_user
    POSTGRES_PASSWORD: fca_password
    POSTGRES_DB: fca_support
  ports:
    - "5433:5432"
  volumes:
    - postgres_data:/var/lib/postgresql/data
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U fca_user -d fca_support"]
    interval: 10s
    timeout: 5s
    retries: 5

The pgvector/pgvector:pg15 image is PostgreSQL 15 with the vector extension pre-compiled, which saves installing postgresql-server-dev-15 and building the extension yourself. POSTGRES_INITDB_ARGS, which does not appear in the excerpt above, forces UTF-8 for emoji-laden categories.
Port 5433:5432 — host sees the DB on port 5433 (avoids clashes with local Postgres on 5432). The container still listens on 5432 internally; only the ports: mapping changes the host-side port.
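Healthcheck: pg_isready returns 0 once the DB accepts connections. Any service that declares depends_on on db with condition: service_healthy, as web does, won't start until the db healthcheck passes. A service that lists db in the short form only waits for the container to start.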
redis — cache + Celery broker
redis:
  image: redis:7-alpine
  command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru

- --appendonly yes: AOF persistence; Redis logs every write to an append-only file on disk so the dataset survives a restart.
- --maxmemory 256mb: caps memory use at 256 MB.
- --maxmemory-policy allkeys-lru: when memory fills, Redis evicts approximately the least-recently-used key, whether or not that key has a TTL.
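Because allkeys-lru may evict any key, including queued Celery messages, one option is to run the cache and the broker as separate Redis instances with different policies. The fragment below is a sketch of that alternative, not the project's configuration; the service names redis-cache and redis-broker are hypothetical.

```yaml
services:
  redis-cache:            # hypothetical name: disposable cache data
    image: redis:7-alpine
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
  redis-broker:           # hypothetical name: Celery queue and results
    image: redis:7-alpine
    # noeviction: writes fail when memory is full instead of silently
    # dropping queued tasks; AOF keeps the queue across restarts.
    command: redis-server --appendonly yes --maxmemory-policy noeviction
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
```

With this split, the Celery broker URL would point at redis-broker and the application cache URL at redis-cache, each on port 6379.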
web — FastAPI
web:
  image: ghcr.io/davidsandeep1996-spec/fca-multi-agent-support/fca-app:latest
  ports:
    - "8000:8000"
  environment:
    DATABASE_URL: postgresql+asyncpg://fca_user:fca_password@db:5432/fca_support
    REDIS_URL: redis://redis:6379/0
  depends_on:
    db: { condition: service_healthy }
    redis: { condition: service_healthy }
  volumes:
    - .:/app # bind-mount local source for live reload
    - ./alembic:/app/alembic
    - ./tests:/app/tests

In DATABASE_URL, db:5432 uses the service name db as the hostname; PostgreSQL accepts connections on its internal port 5432.
Bind-mount .:/app: your local source code is mounted at /app inside the container. Edits appear in the container's filesystem immediately; uvicorn restarts on save only when it is launched with --reload.
worker — Celery
Same image, different command:
worker:
  image: ghcr.io/davidsandeep1996-spec/fca-multi-agent-support/fca-app:latest
  command: celery -A app.worker.celery_app worker --loglevel=info
  healthcheck:
    disable: true

The healthcheck is disabled because Celery has no /health endpoint: any HTTP healthcheck inherited from the shared image would always fail, and Docker would mark a working worker as unhealthy.
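An alternative to disabling the check is a probe that talks to the worker through Celery's own remote-control channel. This is a sketch under assumptions, not the project's setup: it presumes the worker keeps Celery's default node name (celery@ followed by the container hostname) and that the HOSTNAME environment variable is set inside the container.

```yaml
services:
  worker:
    healthcheck:
      # "$$" escapes the dollar sign so Compose leaves HOSTNAME for the
      # container's shell to expand. Exits non-zero if the worker does not reply.
      test: ["CMD-SHELL", "celery -A app.worker.celery_app inspect -d celery@$$HOSTNAME ping"]
      interval: 30s
      timeout: 10s
      retries: 3
```

Each probe starts a new Python process and sends a message through the broker, so a longer interval than for the db or redis checks is reasonable.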
frontend — Streamlit
frontend:
  image: ghcr.io/.../fca-frontend:latest
  ports:
    - "8501:8501"
  environment:
    - API_BASE_URL=http://web:8000/api/v1

The Streamlit container doesn’t talk to Postgres directly, only via the web API.
Common Pitfalls
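Two services publishing the same host port (for example, both mapping host port 8000): the second container fails to start because the port is already allocated. Containers can reuse the same container-side port, since each has its own network namespace, so pick distinct host-side ports.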
Hardcoding localhost in a service’s env var — fails inside Docker because localhost is the container itself, not the host. Use the service name (db, redis, web).
Forgetting depends_on with condition: service_healthy: the web container starts before the DB accepts connections, fails immediately, and restarts in a loop if a restart policy is set.
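The first two pitfalls can be illustrated in one fragment. The admin service, its host port, and the UPSTREAM_URL variable are hypothetical, added only to show two containers listening on the same container-side port and addressing each other by service name.

```yaml
services:
  web:
    ports:
      - "8000:8000"     # host 8000 -> container 8000
  admin:                # hypothetical second service
    ports:
      - "8001:8000"     # different host port, same container port: no clash
    environment:
      # Use the service name, not localhost: inside this container,
      # localhost refers to the admin container itself.
      UPSTREAM_URL: http://web:8000
```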
Real-World Interview Prep
Q1: Why a custom bridge network instead of the default?
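A: On Docker's default bridge network, containers cannot resolve each other by name; they can only reach each other by IP address or through legacy links. On a user-defined bridge, service names resolve to container IPs via Docker’s embedded DNS (127.0.0.11). Compose already creates a user-defined network for each project automatically, so declaring fca-network explicitly mainly makes the network visible in the file and gives you a place to configure it.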
Q2: How do you reproduce a production bug locally?
A: Pin the same image tag in image: (don’t use :latest), copy env vars from prod, and run with docker compose up. If the bug is data-driven, restore a prod backup into the postgres_data volume.
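A small sketch of the pinning step, assuming the tag is passed in through an environment variable. APP_TAG is a hypothetical name, not something the project defines.

```yaml
services:
  web:
    # Compose aborts with the message after ":?" if APP_TAG is unset or empty,
    # so the stack can never silently fall back to :latest.
    image: "ghcr.io/davidsandeep1996-spec/fca-multi-agent-support/fca-app:${APP_TAG:?set APP_TAG to the tag deployed in prod}"
```

Starting the stack then takes the form APP_TAG=<tag> docker compose up, with the tag copied from the production deployment.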
Q3: Why bind-mount source code instead of baking it into the image?
A: Speed. With a bind-mount and uvicorn --reload, code changes appear in <1 second. Without it, you’d rebuild the image (3+ minutes) on every edit. Production deploys use a baked image; dev uses bind-mount.