FastAPI `lifespan` Startup / Shutdown with `asynccontextmanager`

What? (Concept Overview)

The FastAPI lifespan parameter (@asynccontextmanager) replaces the deprecated @app.on_event("startup"/"shutdown") pattern with a single async generator that runs ONCE per worker, before the first request and after the last. It is the supported, type-safe way to wire up DB pool warmup, observability flags, model downloads, and graceful shutdown.

Project Context

The FCA Support Agent’s app/main.py uses lifespan to:

Log observability status (is_observability_enabled from Settings)
Conditionally run init_db() in development (skip in prod where Alembic owns migrations)
Run close_db() on shutdown so pooled connections are closed cleanly instead of being left open against Postgres

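Wiring this in lifespan rather than if __name__ == "__main__" means the lifecycle runs under uvicorn workers AND under pytest’s TestClient, provided the client is used as a context manager (with TestClient(app) as client:). A bare TestClient(app) does not trigger startup or shutdown.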

How? (Quick Reference Blocks)

3.1 The Lifespan Function


# app/main.py
from contextlib import asynccontextmanager
from fastapi import FastAPI
import logging
 
from app.config import settings
from app.logger import setup_logging
from app.database import init_db, close_db
 
logger = logging.getLogger(__name__)
 
@asynccontextmanager
async def lifespan(app: FastAPI):
    # ============ STARTUP ============
    logger.info(f"Starting {settings.app_name} v{settings.app_version}")
    logger.info(f"Environment: {settings.environment}")
    logger.info(f"Debug mode: {settings.debug}")
 
    if settings.is_observability_enabled:
        logger.info("🔭 Observability: ENABLED (Langfuse)")
    else:
        logger.info("🔭 Observability: DISABLED (Missing keys)")
 
    logger.info("📊 Prometheus Metrics: ENABLED at /metrics")
 
    if settings.is_development:
        logger.info("Initializing database tables (development mode)")
        try:
            await init_db()
        except Exception as e:
            logger.error(f"Database initialization failed: {e}", exc_info=True)
 
    logger.info("Application startup complete")
 
    yield  # Application runs in here
 
    # ============ SHUTDOWN ============
    logger.info("Shutting down application")
    try:
        await close_db()
    except Exception as e:
        logger.error(f"Error closing database: {e}", exc_info=True)
    logger.info("Application shutdown complete")

3.2 Wiring It Into `create_application()`


# app/main.py
def create_application() -> FastAPI:
    return FastAPI(
        title=settings.app_name,
        version=settings.app_version,
        description="FCA-compliant multi-agent AI support system",
        # Critical: pass the lifespan generator to FastAPI
        lifespan=lifespan,
        docs_url="/docs" if settings.debug else None,
        redoc_url="/redoc" if settings.debug else None,
        openapi_url="/openapi.json" if settings.debug else None,
    )
 
app = create_application()

Why? (Parameter Breakdown)

@asynccontextmanager instead of two @app.on_event decorators: The old pattern splits the lifecycle across two unrelated functions, so anything created at startup and released at shutdown has to travel through module-level globals or app.state. A single generator keeps setup and teardown in one scope.
yield as the pivot: Anything before yield is “startup”; anything after is “shutdown”. The post-yield block runs on a graceful shutdown, which includes SIGTERM because uvicorn handles that signal and completes the lifespan before exiting. It does not run on SIGKILL, and it is skipped if an exception is thrown into the generator at yield unless the cleanup sits in a finally block.
init_db() gated on settings.is_development: Running DDL on every prod startup lets the live schema drift from what the Alembic migration history describes. Let Alembic own the prod schema; keep init_db() for first-boot dev convenience.
Try/except in startup AND shutdown: A startup error in a non-critical dependency (e.g., logging) should not stop the app from serving, so log and continue. If the dependency is critical, let the exception propagate so the worker fails fast instead. Shutdown errors are mostly cleanup failures; log and swallow them, since the process is exiting anyway.
await close_db() runs engine.dispose(): This drains the SQLAlchemy pool and closes its connections, freeing the corresponding Postgres backends. Without it, connections are only released when the process dies, and workers that get SIGKILLed on a rolling deploy can leave idle connections lingering in pg_stat_activity, which can starve the cluster of connection slots.
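
One detail behind the "yield as the pivot" point is worth seeing in isolation: code after a bare yield runs only when the context exits normally. The sketch below uses only the standard library. The names fragile, robust, crash, and closed are illustrative and do not come from app/main.py, and the RuntimeError stands in for whatever failure might reach the generator while the app is serving.

```python
import asyncio
from contextlib import asynccontextmanager

closed: list[str] = []  # records which cleanup blocks actually ran


@asynccontextmanager
async def fragile():
    yield
    closed.append("fragile")  # skipped if an exception arrives at the yield


@asynccontextmanager
async def robust():
    try:
        yield
    finally:
        closed.append("robust")  # runs on normal exit and on exceptions


async def crash(cm) -> None:
    # Enter the context, then fail inside it, as a serving error might.
    try:
        async with cm():
            raise RuntimeError("simulated failure while serving")
    except RuntimeError:
        pass


asyncio.run(crash(fragile))
asyncio.run(crash(robust))
print(closed)  # ['robust']
```

If close_db() must run no matter how the serving phase ends, wrapping the yield in try/finally is the safer shape.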

Top-to-Bottom Code Walkthrough (`app/main.py` — `lifespan`)

The lifespan in app/main.py is a single @asynccontextmanager-decorated async generator that runs once per worker, before the first request and after the last. It’s the seam between the static process world (DB pool, logging, observability) and the long-running request-handling loop.

@asynccontextmanager does the heavy lifting. It turns the async generator function into an async context manager: FastAPI enters it at startup, the generator pauses at yield while requests are served, and it resumes on shutdown. The lifespan parameter expects a callable that returns an async context manager, so treat the decorator as mandatory. Imports: asynccontextmanager from contextlib, the FastAPI class from fastapi, logging, and the three lifecycle-aware dependencies: settings from app.config, setup_logging from app.logger (imported, though not called in the excerpt shown), and init_db + close_db from app.database.

The function signature async def lifespan(app: FastAPI) accepts the FastAPI instance as a parameter. FastAPI passes its own app object in — useful for advanced cases (e.g., storing app-scoped state via app.state), though this implementation doesn’t use it.

logger.info(f"Starting {settings.app_name} v{settings.app_version}") is the first startup log line. It outputs the configured name + version from Settings — operators immediately see which deployment they’re running.

logger.info(f"Environment: {settings.environment}") and logger.info(f"Debug mode: {settings.debug}") round out the boot banner. These two lines are the human-grep-able markers every operator learns to scan first when investigating an incident.

The observability gate: if settings.is_observability_enabled: checks whether langfuse_public_key AND langfuse_secret_key are both set (see langfuse-llm-tracing-decorator). If true, logger.info("🔭 Observability: ENABLED (Langfuse)"); else "🔭 Observability: DISABLED (Missing keys)". The emoji + uppercase status makes it easy to grep: a single grep "Observability:" against the worker’s stdout answers “is Langfuse on?” without inspecting env. Note that grep is case-sensitive by default, so grep "OBSERVABILITY" would miss the line; use grep -i if you want a case-insensitive match.
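
The gate described above can be pictured with a small sketch. This is not the project’s Settings class (which is not shown here and may well be built on a settings library); SettingsSketch and the example key values are hypothetical, and only the two field names and the property name come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SettingsSketch:
    # Hypothetical stand-in for the project's Settings class.
    langfuse_public_key: Optional[str] = None
    langfuse_secret_key: Optional[str] = None

    @property
    def is_observability_enabled(self) -> bool:
        # Both keys must be present and non-empty.
        return bool(self.langfuse_public_key and self.langfuse_secret_key)


full = SettingsSketch("pk-example", "sk-example")
partial = SettingsSketch(langfuse_public_key="pk-example")
print(full.is_observability_enabled, partial.is_observability_enabled)  # True False
```

Deriving the flag from the keys means there is no separate on/off switch that can disagree with the credentials.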

logger.info("📊 Prometheus Metrics: ENABLED at /metrics") is unconditional because Prometheus instrumentation is installed directly on the app object inside create_application() (the Instrumentator.instrument(app).expose(app, ...) call) — it doesn’t depend on env. This log line confirms “/metrics is up”, useful for confirming the scrape target on first deploy.

if settings.is_development: gates init_db() on dev mode. Going down: logger.info("Initializing database tables (development mode)") precedes the call; the inner try: await init_db() except Exception as e: logger.error(f"Database initialization failed: {e}", exc_info=True) uses exc_info=True to capture the traceback in the JSON log backend. Critically, this does NOT propagate the exception — if init_db() fails on dev start-up, the worker keeps running so someone can hit /health and see what’s wrong instead of staring at a refused connection.
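
A sketch of the log-and-continue behaviour, extended with one assumption: the failure is also recorded on app.state so a health endpoint could report it. The db_ready flag, failing_init_db, and lifespan_sketch are hypothetical names, and SimpleNamespace stands in for a real FastAPI app so the example runs with the standard library alone.

```python
import asyncio
import logging
from contextlib import asynccontextmanager
from types import SimpleNamespace

logger = logging.getLogger("lifespan_sketch")


async def failing_init_db() -> None:
    # Hypothetical stand-in for init_db() when the database is unreachable.
    raise ConnectionError("database unreachable")


@asynccontextmanager
async def lifespan_sketch(app):
    app.state.db_ready = False
    try:
        await failing_init_db()
        app.state.db_ready = True
    except Exception:
        # Log with traceback and continue so the worker still starts.
        logger.error("Database initialization failed", exc_info=True)
    yield


async def main() -> bool:
    # SimpleNamespace mimics the `app.state` attribute bag on a FastAPI app.
    fake_app = SimpleNamespace(state=SimpleNamespace())
    async with lifespan_sketch(fake_app):
        return fake_app.state.db_ready  # what a health endpoint could report


db_ready = asyncio.run(main())
print(db_ready)  # False
```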

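logger.info("Application startup complete") marks the end of the startup block for anyone reading the logs. It is a human-facing marker only: reverse proxies and orchestrators do not read log lines to decide readiness. Kubernetes readiness probes use HTTP, TCP, or exec checks, so point them at an endpoint such as /health. Because uvicorn only begins accepting connections after lifespan startup finishes, a hung startup also shows up as a failing probe.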

yield is the pivot. The function pauses here while FastAPI serves requests. On shutdown the server exits the context manager, the generator resumes, and everything after yield runs.
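
To make the pause-and-resume behaviour concrete, here is a minimal standard-library sketch. demo_lifespan and the events list are illustrative names, not part of app/main.py, and the async with block stands in for what the ASGI server does around its request loop.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []  # hypothetical recorder, used only for this demo


@asynccontextmanager
async def demo_lifespan(app: object):
    events.append("startup")   # runs when the context is entered
    yield                      # the server would handle requests here
    events.append("shutdown")  # runs when the context exits normally


async def main() -> list[str]:
    async with demo_lifespan(app=None):
        events.append("serving")
    return events


result = asyncio.run(main())
print(result)  # ['startup', 'serving', 'shutdown']
```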

The shutdown section mirrors startup: logger.info("Shutting down application") is the human-grep-able marker. try: await close_db() except Exception as e: logger.error(...) calls close_db(), which internally awaits engine.dispose(). That drains the SQLAlchemy pool, closes open TCP connections, and frees Postgres backend slots. The call is wrapped in try/except because the process is exiting anyway; an exception while disposing is the rare case where you simply log and move on.

logger.info("Application shutdown complete") is the last line. On real graceful shutdown, both startup-complete and shutdown-complete lines should appear in the worker logs; absence of shutdown-complete on a SIGKILL (e.g., Kubernetes eviction under pressure) is a tell-tale sign of incomplete cleanup.

The wiring side (in create_application()): the lifespan is passed to FastAPI(lifespan=lifespan, ...). Other key args: title=settings.app_name, version=settings.app_version, description=..., docs_url="/docs" if settings.debug else None — disable docs in production to avoid leaking internal endpoint shapes; same for redoc_url="/redoc" and openapi_url="/openapi.json". lifespan=lifespan is the single call that registers the start-up/shutdown pair with the framework. Without it, no startup runs and no shutdown runs.

Common Pitfalls

Doing I/O work BEFORE yield without a timeout. A hanging init_db() (e.g., DB not ready) blocks worker startup indefinitely; orchestration systems mark the pod unhealthy. Wrap startup awaits in asyncio.wait_for(...) with a 30s budget, mirroring the checkpointer.setup() pattern from the Checkpointing page.
Raising from the post-yield block. FastAPI can only log shutdown errors; it cannot meaningfully handle them, and the process exit code may still be 0. Use logger.error(..., exc_info=True) and let Kubernetes restart you if state is corrupted.
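
The timeout advice in the first pitfall can be sketched as follows. hanging_init_db is a hypothetical stand-in for an init_db() that never returns, and the 0.05s budget is chosen only so the demo finishes quickly; the text above suggests 30s for real startup work.

```python
import asyncio


async def hanging_init_db() -> None:
    # Hypothetical stand-in for an init_db() that never returns.
    await asyncio.Event().wait()


async def init_with_budget(timeout_s: float) -> str:
    try:
        await asyncio.wait_for(hanging_init_db(), timeout=timeout_s)
        return "ready"
    except asyncio.TimeoutError:
        # wait_for cancels the hanging call before raising.
        return "timed out"


outcome = asyncio.run(init_with_budget(timeout_s=0.05))
print(outcome)  # timed out
```

Inside a real lifespan the except branch would log the timeout and then either continue or re-raise, depending on whether the dependency is critical.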

Real-World Interview Prep

Q1: Why did FastAPI deprecate `@app.on_event` in favour of `lifespan`?

A: Three reasons. (1) Single source of truth: startup and shutdown live in one function, so shared state (a connection pool, a warmup cache) doesn’t need module-level globals. (2) Resource scoping: on_event handlers could already be async, but they were two separate callbacks, so nothing tied a resource’s cleanup to its setup. Inside lifespan you can wrap the yield in async with or try/finally and get paired teardown for free. (3) Exception semantics: an exception raised before yield fails the lifespan startup, so the server exits instead of routing traffic to a half-initialised app, and the orchestrator sees the pod as “not ready”. This only holds for errors you let propagate; the try/except around init_db() deliberately opts out.

Q2: Your pod is healthy in Kubernetes but `/health` returns 503. Where do you start?

A: Walk three layers. (1) kubectl logs <pod> — does the Application startup complete log appear? If not, lifespan startup hung (DB unreachable, missing migration — check exc_info=True lines). (2) If startup DID complete, hit /health directly with kubectl exec — does check_db_connection() return False? Then the pool is full or DB is down. (3) Inspect /metrics for the pool stats — sqlalchemy_pool_size vs sqlalchemy_pool_checked_out. Bumping pool_size without raising the Postgres max_connections is the most common cause.

Q3: How do you gracefully drain in-flight SSE / streaming responses during a rolling deploy?

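A: Under uvicorn, SIGTERM stops the listener first, then in-flight requests get a chance to finish, and only then does the post-yield block of lifespan run. A long-lived SSE stream therefore delays shutdown rather than being interrupted by it. Cancel-friendly pattern: in the SSE handler, wrap the generator body in try/finally and check an asyncio.Event that is set as soon as shutdown begins (the post-yield block may run too late for this, because it executes after connections are drained); if it is set, break out of the loop. Also give the load balancer time to stop routing to the pod: a Kubernetes preStop hook with sleep 10 runs before SIGTERM is sent, paired with terminationGracePeriodSeconds: 30. Tune the grace period to your slowest realistic response tail; 95th percentile + 5s is a reasonable starting point.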
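
The cancel-friendly loop from Q3 can be sketched with the standard library alone. sse_events, shutting_down, and the tick payload are hypothetical names; how the event gets set in a real deployment is left open, and the consumer loop below merely simulates a shutdown signal after three chunks.

```python
import asyncio
from typing import AsyncIterator


async def sse_events(shutting_down: asyncio.Event, limit: int = 100) -> AsyncIterator[str]:
    sent = 0
    try:
        # Stop producing as soon as the shutdown flag is observed.
        while sent < limit and not shutting_down.is_set():
            yield f"data: tick {sent}\n\n"
            sent += 1
            await asyncio.sleep(0)  # give other tasks a chance to run
    finally:
        # Per-stream cleanup (closing cursors, releasing locks) would go here.
        pass


async def main() -> list[str]:
    shutting_down = asyncio.Event()
    received: list[str] = []
    async for chunk in sse_events(shutting_down):
        received.append(chunk)
        if len(received) == 3:
            shutting_down.set()  # simulate the shutdown signal
    return received


chunks = asyncio.run(main())
print(len(chunks))  # 3
```

The generator checks the flag once per iteration, so the worst-case drain delay is one loop interval rather than the full stream lifetime.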
