Docker Compose Multi-Service Orchestration

What

A docker-compose stack whose services gate their startup on explicit service_healthy dependencies so Postgres comes up before the API, Redis comes up before the worker, and the worker comes up before the frontend. Volumes bind the source tree into both web and worker so changes propagate live without a rebuild.

Project Context

In full_project_context_updated.txt -> ./docker-compose.yml, the stack is db (pgvector), redis (cache + Celery broker/backend), web (FastAPI), worker (Celery), frontend (Streamlit). Postgres uses pgvector/pgvector:pg15 to ship the pgvector extension pre-installed. Both web and worker mount .:/app so a code edit on the host is visible in both containers without rebuilding. The worker disables its healthcheck (disable: true) because a Celery worker process has no HTTP port to probe.

How

Service definitions, in dependency order

    services:
      db:
        image: pgvector/pgvector:pg15
        healthcheck:
          test: ["CMD-SHELL", "pg_isready -U fca_user -d fca_support"]
          interval: 10s
          timeout: 5s
          retries: 5
          start_period: 10s
      redis:
        image: redis:7-alpine
        command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
        healthcheck:
          test: ["CMD", "redis-cli", "ping"]
      web:
        depends_on:
          db: { condition: service_healthy }
          redis: { condition: service_healthy }
        volumes:
          - .:/app                   # sync python code into container
          - ./alembic:/app/alembic   # sync migrations
          - ./tests:/app/tests
        command: uvicorn app.main:app --host 0.0.0.0 --port 8000
      worker:
        depends_on:
          - db
          - redis
        volumes:
          - .:/app
        command: celery -A app.worker.celery_app worker --loglevel=info
        healthcheck:
          disable: true

  • condition: service_healthy is required (not just service_started) so the app waits until pg_isready returns 0. Without condition: service_healthy the API can start before Postgres is ready and crash with connection refused.
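  • allkeys-lru evicts the least-recently-used keys once maxmemory is reached, which suits the cache role. It does nothing for job fairness, and because it may evict any key, queued Celery messages and stored results in the same instance are also candidates for eviction under memory pressure. If losing them is unacceptable, noeviction or a separate Redis instance for the broker is the safer choice.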
  • Mounting .:/app in both web and worker saves a rebuild on every code edit, but the process inside the container still has to pick up the change. uvicorn does so only when started with --reload, which the command shown above does not include. Celery workers never reload on their own, so run docker compose restart worker after editing task or model code.
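  • disable: true on worker.healthcheck turns off any HEALTHCHECK the worker inherits from its image; Compose itself adds no default check. Celery exposes no HTTP endpoint, so an HTTP probe baked into the shared image for the web role (an assumption, since the Dockerfile is not shown) would fail in the worker and mark it unhealthy once its retries run out.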
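
The live-reload behavior in the third bullet depends on uvicorn's --reload flag. A minimal sketch of the web command with it enabled, assuming the same app.main:app entry point; the --reload-dir value "app" is an assumed directory name, not something the source confirms:

```yaml
services:
  web:
    # --reload restarts the server when watched files change (development only).
    # --reload-dir narrows the watcher to one directory; "app" is an assumed path.
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload --reload-dir app
```

This flag belongs in development only, since the file watcher adds overhead and restarts the server mid-request.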

Common Pitfalls

depends_on without condition: service_healthy treats the dependency as met the moment its container has started, not when the service inside is ready to accept connections. Always use condition: service_healthy for stateful services like Postgres and Redis.
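
For the worker, which the excerpt above declares with the short list form, the long form might look like the following. This is a sketch that assumes db and redis keep the healthchecks shown in the service definitions:

```yaml
services:
  worker:
    depends_on:
      db:
        condition: service_healthy   # wait until pg_isready succeeds
      redis:
        condition: service_healthy   # wait until redis-cli ping succeeds
```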

Leaving PYTHONUNBUFFERED=1 out of the compose environment lets Python block-buffer stdout when it is not attached to a TTY, so output appears in delayed chunks, or is lost if the process is killed before the buffer flushes. Always set it.
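
One way to set the variable for both Python services is shown below. This is a sketch; the entries would sit alongside whatever environment each service already defines:

```yaml
services:
  web:
    environment:
      PYTHONUNBUFFERED: "1"   # any non-empty value turns off output buffering
  worker:
    environment:
      PYTHONUNBUFFERED: "1"
```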

Real-World Interview Prep

Q1: When would you migrate from docker compose to Kubernetes?

A: Three signals. (1) You need rolling deploys / zero-downtime (docker compose up -d always kills-then-starts). (2) You need auto-scaling on CPU/memory/custom-metrics. (3) You have multiple services that need NetworkPolicies (e.g., worker should not have ingress). Stay on compose when you have one host, one stack, no scaling needs, and short-lived dev environments. The migration cost is real — Helm charts, RBAC, secrets management all need to be set up. For the FCA stack on a single VM, compose is fine; on 5+ VMs with HPA, K8s.

Q2: What’s the difference between depends_on with and without condition: service_healthy?

A: Without a condition, depends_on only ensures the dependency’s container has started. Postgres might still be initializing when the API tries to connect, so the app crashes with connection refused. condition: service_healthy waits until the dependency’s healthcheck.test exits with 0, i.e., Postgres is actually accepting connections. The healthcheck block on the dependency defines what “healthy” means; without it Compose has no signal to wait on. The gotcha: if an HTTP healthcheck floods the probed service’s access logs, switch to a TCP-level probe (nc -z) instead of an HTTP GET.
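
A TCP-level probe can be sketched without nc by using the Python interpreter the FastAPI image already needs. Both the healthcheck on web and the frontend gating below are illustrative assumptions, not taken from the project's file:

```yaml
services:
  web:
    healthcheck:
      # Opens and closes a TCP connection to uvicorn; a refused connection
      # raises an exception, so the probe exits non-zero.
      test: ["CMD", "python", "-c", "import socket; socket.create_connection(('127.0.0.1', 8000), timeout=2).close()"]
      interval: 10s
      timeout: 5s
      retries: 5
  frontend:
    depends_on:
      web:
        condition: service_healthy   # start Streamlit only once the API port is open
```

A TCP probe only proves the port is open. It produces no access-log entries, but it also cannot tell whether the application behind the port is answering requests correctly.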

Q3: When should you volume-mount source code (- .:/app) vs build it into the image?

A: Mount source code when (a) you’re in local development and want live reload, (b) you need the dev container to share machine-local state (gitignored files, IDE artifacts). Build into the image when (a) you’re shipping to production — runtime containers must NOT depend on host filesystem layout, (b) you need reproducible builds (every image layer is immutable). Many teams transition via the “synced code in dev, distroless image in prod” pattern: docker-compose.yml for dev mounts the host tree, the CI pipeline builds a multi-stage image with COPY . baked in. Don’t ship mount-based code to prod — the host path is meaningless inside Kubernetes pods.
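
One common way to keep the two modes apart is an override file, which docker compose up merges automatically when no -f flag is given. The layout below is a sketch of an alternative arrangement, assuming the base docker-compose.yml holds only the image-based definitions; the project described here keeps its mounts in the main file:

```yaml
# docker-compose.override.yml (merged automatically by `docker compose up`)
services:
  web:
    volumes:
      - .:/app        # host source tree, development only
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
  worker:
    volumes:
      - .:/app
```

Running docker compose -f docker-compose.yml up -d names the base file explicitly, which skips the override and leaves the containers running only the code baked into the image.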

Top-to-Bottom Code Walkthrough (docker-compose.yml)

This file is the single source of truth for the local stack. Every service is a container on a shared bridge network — service names (web, db, redis) become hostnames for inter-service communication.

Top-level structure

    version: '3.8'
    services:
      db: ...
      redis: ...
      web: ...
      worker: ...
      frontend: ...
    volumes:
      postgres_data: ...
      redis_data: ...
      fca-logs: ...
    networks:
      fca-network:
        driver: bridge

  • version: '3.8' is the Compose file format version; current Docker Compose releases treat this key as obsolete and ignore it.
  • services: holds one block per container.
  • volumes: declares named volumes, which persist data outside containers.
  • networks: defines a custom bridge so containers can talk to each other by name.
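
The excerpt elides the service bodies, so how each service joins fca-network is not shown. A sketch of the usual wiring, with the per-service networks entries being an assumption:

```yaml
services:
  web:
    networks:
      - fca-network   # web can now reach the database at hostname "db"
  db:
    networks:
      - fca-network

networks:
  fca-network:
    driver: bridge
```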

db — Postgres + pgvector

    db:
      image: pgvector/pgvector:pg15
      container_name: fca-postgres
      restart: unless-stopped
      environment:
        POSTGRES_USER: fca_user
        POSTGRES_PASSWORD: fca_password
        POSTGRES_DB: fca_support
      ports:
        - "5433:5432"
      volumes:
        - postgres_data:/var/lib/postgresql/data
      healthcheck:
        test: ["CMD-SHELL", "pg_isready -U fca_user -d fca_support"]
        interval: 10s
        timeout: 5s
        retries: 5

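Image pgvector/pgvector:pg15 is PostgreSQL 15 with the vector extension pre-installed, which saves building it yourself against postgresql-server-dev-15. The extension still has to be enabled in each database with CREATE EXTENSION vector. POSTGRES_INITDB_ARGS (not shown in this excerpt) forces UTF-8 so categories containing emoji are stored correctly.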

Port 5433:5432 — host sees the DB on port 5433 (avoids clashes with local Postgres on 5432). The container still listens on 5432 internally; only the ports: mapping changes the host-side port.

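Healthcheck: pg_isready exits with 0 once the DB accepts connections. Services that declare depends_on with condition: service_healthy, as web does, will not start until that check passes. The short list form used in the worker excerpt earlier waits only for the container to start.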

redis — cache + Celery broker

    redis:
      image: redis:7-alpine
      command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru

  • --appendonly yes enables AOF persistence: Redis appends every write to a log on disk, fsynced once per second by default.
  • --maxmemory 256mb caps memory use at 256 MB.
  • --maxmemory-policy allkeys-lru evicts the least-recently-used keys once that cap is reached.
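
Because allkeys-lru may evict any key, the lists that hold queued Celery messages are also candidates when one instance serves as both cache and broker. One option is to split the roles, sketched below. The service names are hypothetical, and the project itself uses a single redis service:

```yaml
services:
  redis-cache:      # hypothetical name: cache only, safe to evict
    image: redis:7-alpine
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru
  redis-broker:     # hypothetical name: Celery broker/backend, never evicts
    image: redis:7-alpine
    command: redis-server --appendonly yes --maxmemory-policy noeviction
```

With noeviction, Redis rejects writes once a configured memory limit is reached instead of silently dropping keys, so a full broker surfaces as an error rather than as lost tasks.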

web — FastAPI

    web:
      image: ghcr.io/davidsandeep1996-spec/fca-multi-agent-support/fca-app:latest
      ports:
        - "8000:8000"
      environment:
        DATABASE_URL: postgresql+asyncpg://fca_user:fca_password@db:5432/fca_support
        REDIS_URL: redis://redis:6379/0
      depends_on:
        db: { condition: service_healthy }
        redis: { condition: service_healthy }
      volumes:
        - .:/app   # bind-mount local source for live reload
        - ./alembic:/app/alembic
        - ./tests:/app/tests

In db:5432, the service name db is the hostname and 5432 is PostgreSQL’s container-side port. The bind mount .:/app places your local source tree at /app inside the container, so edits are visible immediately; uvicorn restarts on save only when it runs with --reload.

worker — Celery

Same image, different command:

    worker:
      image: ghcr.io/davidsandeep1996-spec/fca-multi-agent-support/fca-app:latest
      command: celery -A app.worker.celery_app worker --loglevel=info
      healthcheck:
        disable: true

The healthcheck is disabled because Celery has no /health endpoint to probe; any HTTP check inherited from the shared image would report the worker as unhealthy even while it is processing tasks normally.
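
If a liveness signal for the worker is wanted, an alternative to disabling the check is to ping the worker through Celery's own remote-control command. This is a sketch and not what the project does; it assumes the worker keeps Celery's default node name celery@<hostname>:

```yaml
services:
  worker:
    healthcheck:
      # Asks this worker to reply over the broker; exits non-zero if no reply.
      # $$ escapes Compose interpolation so the shell expands HOSTNAME.
      test: ["CMD-SHELL", "celery -A app.worker.celery_app inspect ping -d celery@$$HOSTNAME"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
```

Each probe starts a new Python process and makes a round trip through Redis, so a longer interval than the database check is reasonable.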

frontend — Streamlit

    frontend:
      image: ghcr.io/.../fca-frontend:latest
      ports:
        - "8501:8501"
      environment:
        - API_BASE_URL=http://web:8000/api/v1

The Streamlit container doesn’t talk to Postgres directly — only via the web API.

Common Pitfalls

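Two services publishing the same host port (for example, both mapping "8000:8000"): the second container fails to start with a “port is already allocated” error. Containers may reuse the same container-side port freely; pick distinct host-side ports.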
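
A sketch of two services that both listen on container port 8000 but publish different host ports; web-staging is a hypothetical second service used only for illustration:

```yaml
services:
  web:
    ports:
      - "8000:8000"   # host 8000 -> container 8000
  web-staging:        # hypothetical service
    ports:
      - "8001:8000"   # host 8001 -> container 8000, no clash
```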

Hardcoding localhost in a service’s env var — fails inside Docker because localhost is the container itself, not the host. Use the service name (db, redis, web).
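
Applied to the web service's environment, the difference looks like this. The working values are the ones from the web definition; the commented-out line illustrates the mistake:

```yaml
services:
  web:
    environment:
      # Wrong inside a container: localhost is the web container itself,
      # and 5433 is the host-side port, which does not exist on the Docker network.
      # DATABASE_URL: postgresql+asyncpg://fca_user:fca_password@localhost:5433/fca_support
      # Right: service name plus the container-side port.
      DATABASE_URL: postgresql+asyncpg://fca_user:fca_password@db:5432/fca_support
      REDIS_URL: redis://redis:6379/0
```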

Forgetting condition: service_healthy in depends_on lets the web container start before Postgres accepts connections; it fails on its first connection attempt and, under a restart policy, restarts in a loop until the DB is up.

Real-World Interview Prep

Q1: Why a custom bridge network instead of the default?

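A: Docker’s default bridge network provides no DNS-based name resolution between containers; they can reach each other only by IP address or legacy links. A user-defined bridge does provide it, and Compose creates one for every project automatically, so service names resolve to container IPs via Docker’s embedded DNS (127.0.0.11). Declaring fca-network explicitly mainly gives the network a predictable name and a place to set driver options.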

Q2: How do you reproduce a production bug locally?

A: Pin the same image tag in image: (don’t use :latest), copy env vars from prod, and run with docker compose up. If the bug is data-driven, restore a prod backup into the postgres_data volume.
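
Pinning can be made hard to forget by requiring the tag through variable interpolation. In this sketch APP_TAG and prod-repro.env are hypothetical names introduced for illustration:

```yaml
services:
  web:
    # APP_TAG is a hypothetical variable; the :? form makes Compose abort
    # with the given message when it is unset or empty.
    image: ghcr.io/davidsandeep1996-spec/fca-multi-agent-support/fca-app:${APP_TAG:?set APP_TAG to the tag running in prod}
    env_file:
      - ./prod-repro.env   # hypothetical file holding env vars copied from prod
```

Keep any file of copied production variables out of version control, since it is likely to contain credentials.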

Q3: Why bind-mount source code instead of baking it into the image?

A: Speed. With a bind-mount and uvicorn --reload, code changes appear in <1 second. Without it, you’d rebuild the image (3+ minutes) on every edit. Production deploys use a baked image; dev uses bind-mount.
