GitHub Actions CI/CD — Lint, Type, Test, Seed, Deploy

What? (Concept Overview)

A CI/CD pipeline gates every PR on lint, type-check, and tests run against real Postgres and Redis service containers. Every merge builds container images, pushes them to GitHub Container Registry (GHCR), and then triggers a remote redeploy over SSH. Convention over configuration: maintainers review GitHub Actions YAML, not bespoke shell scripts. The pipeline is reproducible because the service containers are created fresh on every run, so no state leaks between runs.

Project Context

The FCA Support Agent ships two workflows:

.github/workflows/ci.yml — lint (Ruff), type-check (mypy), test (pytest with coverage) gated on a Postgres + Redis service container
.github/workflows/deploy.yml — Build + push web/frontend container images to GHCR, then SSH into the target server to pull and restart

How? (Quick Reference Blocks)

3.1 CI Workflow Skeleton


# .github/workflows/ci.yml
name: CI
 
on:
  pull_request:
    branches: [main]
  push:
    branches: [main]
 
jobs:
  test:
    runs-on: ubuntu-latest
 
    services:
      postgres:
        image: pgvector/pgvector:pg15
        env:
          POSTGRES_USER: fca_user
          POSTGRES_PASSWORD: fca_password
          POSTGRES_DB: fca_test
        ports: ["5432:5432"]
        options: >-
          --health-cmd "pg_isready -U fca_user"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
      redis:
        image: redis:7-alpine
        ports: ["6379:6379"]
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
 
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: {python-version: "3.12"}
      - run: pip install -r requirements.txt
      - run: ruff check .                 # lint
      - run: mypy app                     # type-check
      - run: pytest -v --cov=app
        env:
          DATABASE_URL: postgresql+asyncpg://fca_user:fca_password@localhost:5432/fca_test
          REDIS_URL:    redis://localhost:6379/0

3.2 Deploy Workflow — Build & Push Multi-Image


# .github/workflows/deploy.yml
name: Deploy
 
on:
  push:
    branches: [main]
 
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
 
    steps:
      - uses: actions/checkout@v4
 
      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
 
      - name: Build & push web image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ghcr.io/${{ github.repository_owner }}/fca-multi-agent-support/fca-app:latest
 
      - name: Build & push frontend image
        uses: docker/build-push-action@v6
        with:
          context: ./frontend
          push: true
          tags: ghcr.io/${{ github.repository_owner }}/fca-multi-agent-support/fca-frontend:latest

3.3 Trigger Remote Redeploy Over SSH


# .github/workflows/deploy.yml — tail
      - name: Trigger remote redeploy
        uses: appleboy/ssh-action@v1
        with:
          host:    ${{ secrets.DEPLOY_HOST }}
          username: ${{ secrets.DEPLOY_SSH_USER }}
          key:     ${{ secrets.DEPLOY_SSH_KEY }}
          script: |
            cd ~/fca-multi-agent-support
            docker compose pull
            docker compose up -d
            docker image prune -f

Why? (Parameter Breakdown)

Service containers --health-cmd: GitHub Actions gates on container health checks; the runner waits until Postgres (pg_isready) and Redis (redis-cli ping) report healthy before running any step. Without health checks, tests can start before the database accepts connections and fail intermittently.
postgresql+asyncpg URL inside CI: Mirrors prod; tests that use a sync URL hide async-specific bugs such as connection-pool exhaustion (asyncpg's TooManyConnectionsError).
pip install -r requirements.txt instead of pip install -e .: The lockfile (pip-compile → requirements.txt) is the source of truth for prod, so PRs must not introduce unpinned dependencies. Use -e only for packages that change alongside this repo.
Multi-image build in deploy: The two images (web + frontend) are built separately because they have different Dockerfiles and different consumers. A single image would bundle Streamlit + FastAPI needlessly; separate images can be deployed and scaled independently.
docker compose pull && docker compose up -d: Always pull so the host does not keep running a stale cached copy of the image tag. docker image prune -f removes dangling images so disk usage does not grow unbounded on long-lived deploy hosts.
SSH trigger (not a Kubernetes manifest edit): The simplest deploy mechanism for a single-VM setup. Replace it when you scale: a GitHub Actions → Helm/ArgoCD pipeline is the multi-cluster upgrade.

Common Pitfalls

Caching pip without a key that invalidates when requirements.txt changes. Include hashFiles('requirements.txt') in the actions/cache@v4 key; otherwise a stale cache can mask dependency changes and break installs in CI.
Running tests without a pytest-asyncio mode declared. Async tests need @pytest.mark.asyncio on each test or asyncio_mode = "auto" under [tool.pytest.ini_options] in pyproject.toml. Without it, async tests are skipped or fail with warnings such as “coroutine was never awaited”.
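
The two pitfalls above can both be handled in the workflow itself. The following is a minimal sketch, assuming pip's default cache directory on the Ubuntu runner (~/.cache/pip) and that pytest-asyncio is already listed in requirements.txt; neither assumption is confirmed by the project files shown here.

```yaml
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-python@v5
    with: {python-version: "3.12"}
  # Key the pip cache on the lockfile hash so a dependency change
  # creates a new cache entry instead of reusing a stale one.
  - uses: actions/cache@v4
    with:
      path: ~/.cache/pip          # assumption: default pip cache dir on ubuntu-latest
      key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
  - run: pip install -r requirements.txt
  # --asyncio-mode=auto is pytest-asyncio's command-line switch for auto mode;
  # async tests then run without a per-test marker.
  - run: pytest --asyncio-mode=auto -v --cov=app
```

Setting the mode on the command line keeps the sketch in one file; declaring it once in the pytest configuration has the same effect and also applies to local runs.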

Real-World Interview Prep

Q1: How would you migrate this single-VM deploy pipeline to Kubernetes?

A: Three migrations. (1) Replace the SSH deploy step with kubectl apply -f k8s/ after the build. (2) Move secrets into Kubernetes Secrets managed by Bitnami Sealed Secrets or External Secrets Operator. (3) Replace docker compose pull with kubectl set image deployment/fca-web fca-web=<new-tag> for rolling updates (the --record flag is deprecated, so omit it). Add a canary step that ships to 10% of pods via Argo Rollouts and waits for the SLO check before promoting.
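
As a rough illustration of migration (3), the SSH step could be replaced by a step like the one below. This is a sketch under stated assumptions: kubectl on the runner is already authenticated against the cluster, the deployment and container are both named fca-web as in the answer above, and the build step pushed an image tagged with the commit SHA (the workflows shown in this document do not do that yet).

```yaml
# Hypothetical replacement for the SSH step in deploy.yml.
- name: Roll out new web image
  env:
    IMAGE: ghcr.io/${{ github.repository_owner }}/fca-multi-agent-support/fca-app
    TAG: ${{ github.sha }}        # assumption: the build step also pushed this tag
  run: |
    # Point the fca-web container at the new image; this starts a rolling update.
    kubectl set image deployment/fca-web fca-web="$IMAGE:$TAG"
    # Block until the rollout finishes, or fail the job after two minutes.
    kubectl rollout status deployment/fca-web --timeout=120s
```

Waiting on rollout status makes a failed rollout fail the workflow, which the fire-and-forget SSH script does not do.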

Q2: Your CI is slow. First three things you’d profile?

A: (1) pip install time → cache with actions/setup-python@v5 + cache: 'pip'. (2) Postgres boot → service containers are scoped to a single job and cannot be shared across jobs, so keep database-dependent steps in one job and give unit-only jobs no services. (3) Test selection → split unit and integration tests; run unit tests on every PR and the slower integration tests only on main. Beyond that, run matrix builds by Python version in parallel.
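
The unit/integration split from the answer above could look roughly like this. It is a sketch, not the project's workflow: the integration marker is a hypothetical pytest marker that would have to be registered and applied to the database-backed tests, and Redis is left out of the services block for brevity.

```yaml
jobs:
  unit:
    runs-on: ubuntu-latest        # no services: nothing to boot
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: pip
      - run: pip install -r requirements.txt
      - run: pytest -m "not integration"

  integration:
    # Run only on pushes (to main), not on pull requests.
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    services:
      postgres:
        image: pgvector/pgvector:pg15
        env:
          POSTGRES_USER: fca_user
          POSTGRES_PASSWORD: fca_password
          POSTGRES_DB: fca_test
        ports: ["5432:5432"]
        options: >-
          --health-cmd "pg_isready -U fca_user"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: pip
      - run: pip install -r requirements.txt
      - run: pytest -m integration
        env:
          DATABASE_URL: postgresql+asyncpg://fca_user:fca_password@localhost:5432/fca_test
```

The two jobs run in parallel on a push, and a pull request pays only for the unit job.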

Q3: Why `appleboy/ssh-action` rather than `fabric` or `ansible`?

A: Native GitHub Actions integration; no Python runtime on the runner required; secret injection is built-in. fabric/ansible would require installing Python dependencies on the runner and managing credential storage separately. The trade-off: appleboy/ssh-action is great for one-line deploy scripts, but for complex multi-step orchestration move to Ansible or Pulumi.

Top-to-Bottom Code Walkthrough (`.github/workflows/ci.yml` + `.github/workflows/deploy.yml`)

CI/CD is the last line of defence. No PR merges without green tests; no production deploy without a passing CI.
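
A deploy workflow that triggers directly on push runs independently of CI, so "no deploy without a passing CI" needs an explicit link between the two. One way to express that link, sketched here for the push-to-main variant and assuming the CI workflow is named CI as shown in this document:

```yaml
# deploy.yml trigger: run only after the "CI" workflow finishes on main.
on:
  workflow_run:
    workflows: ["CI"]             # must match the name: field in ci.yml
    types: [completed]
    branches: [main]

jobs:
  build-and-deploy:
    # workflow_run also fires when CI fails, so check the conclusion explicitly,
    # and ignore CI runs that were started by pull requests.
    if: ${{ github.event.workflow_run.conclusion == 'success' && github.event.workflow_run.event == 'push' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Build the exact commit that CI tested.
          ref: ${{ github.event.workflow_run.head_sha }}
```

The remaining build, push, and SSH steps would follow unchanged.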

`ci.yml` — triggered on push and pull_request


name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: pgvector/pgvector:pg15
        env:
          POSTGRES_USER: fca_user
          POSTGRES_PASSWORD: fca_password
          POSTGRES_DB: fca_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd "pg_isready -U fca_user"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
      redis:
        image: redis:7-alpine
        ports:
          - 6379:6379

GitHub Actions service containers spin up Postgres and Redis as sidecars. The CI test runner connects to them via localhost:5432 and localhost:6379 — same URLs the app uses.

Steps


steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
  with:
    python-version: '3.11'
    cache: 'pip'
- name: Install dependencies
  run: pip install -r requirements.txt
- name: Lint
  run: |
    black --check app/
    isort --check app/
    flake8 app/
- name: Type-check
  run: mypy app/
- name: Test
  env:
    DATABASE_URL: postgresql+asyncpg://fca_user:fca_password@localhost:5432/fca_test
    REDIS_URL: redis://localhost:6379/0
  run: pytest --cov=app --cov-report=xml
- name: Upload coverage
  uses: codecov/codecov-action@v4

actions/checkout@v4 — clone the repo.
actions/setup-python@v5 — install Python 3.11 + cache pip packages for speed.
Lint step — black --check, isort --check, flake8 enforce style.
Type-check — mypy app/ static analysis.
Test step — runs pytest with coverage; env vars point at the service containers.
Coverage upload — pushes the coverage.xml artifact to codecov.io.

What blocks a PR

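Any failing step blocks the merge once the CI check is marked as required in branch protection. A coverage drop fails the build only if a threshold is enforced, for example with pytest-cov's --cov-fail-under or a Codecov status check; the pytest command above does not set one.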
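
To make a coverage threshold part of the gate, the Test step can ask pytest-cov to fail the run itself. A minimal sketch; the value 80 is a placeholder, not the project's actual threshold:

```yaml
- name: Test
  env:
    DATABASE_URL: postgresql+asyncpg://fca_user:fca_password@localhost:5432/fca_test
    REDIS_URL: redis://localhost:6379/0
  # --cov-fail-under makes pytest exit non-zero when total coverage is below
  # the given percentage, which fails the step and therefore the check.
  run: pytest --cov=app --cov-report=xml --cov-fail-under=80
```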

`deploy.yml` — triggered on tag push


name: Deploy
on:
  push:
    tags: ['v*.*.*']
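jobs:
  build-and-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # GITHUB_TOKEN needs this scope to push to GHCR
    steps:
    - uses: actions/checkout@v4
    - name: Build Docker image
      # Tag with the full GHCR name so the push step below finds the image
      run: docker build -t ghcr.io/davidsandeep1996-spec/fca-multi-agent-support/fca-app:${{ github.ref_name }} .
    - name: Push to GHCR
      run: |
        echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
        docker push ghcr.io/davidsandeep1996-spec/fca-multi-agent-support/fca-app:${{ github.ref_name }}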
  deploy:
    needs: build-and-push
    runs-on: ubuntu-latest
    steps:
    - name: SSH to VPS
      uses: appleboy/ssh-action@v1
      with:
        host: ${{ secrets.VPS_HOST }}
        username: ${{ secrets.VPS_USER }}
        key: ${{ secrets.VPS_SSH_KEY }}
        script: |
          cd /opt/fca
          git pull
          docker compose pull
          docker compose up -d

Tag-triggered deploy: pushing v1.2.3 builds the image, then SSHes to the VPS to run docker compose pull && docker compose up -d. Manual deploys from the Actions UI additionally require a workflow_dispatch trigger, which the on: block above does not declare.
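
Manual runs from the Actions UI depend on a workflow_dispatch trigger. A sketch of the on: block with that trigger added alongside the tag trigger:

```yaml
on:
  push:
    tags: ['v*.*.*']
  # Adds a "Run workflow" button in the Actions UI.
  workflow_dispatch:
```

On a manual run, github.ref_name is the branch or tag selected in the UI, so choosing a release tag there keeps the image tag meaningful.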

Common Pitfalls

Leaking secrets in workflow logs: secrets.* values are redacted automatically, but only when they appear verbatim; never echo a token or print a transformed copy (for example base64-encoded).

Skipping cache: 'pip' in actions/setup-python: without the cache, every run downloads all packages again, which can take minutes instead of seconds. setup-python derives the cache key from the hash of requirements.txt, so no manual key is needed.

Allowing merges with failing tests: configure branch protection on main to require the CI check to pass. Don’t disable it without a written exception.
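
For the secrets pitfall, one common pattern is to hand the token to the shell through an environment variable instead of interpolating it into the script text. A minimal sketch; GHCR_TOKEN is an illustrative variable name, not one used by the project:

```yaml
- name: Log in to GHCR
  env:
    GHCR_TOKEN: ${{ secrets.GITHUB_TOKEN }}   # illustrative variable name
  # The shell reads the token from the environment at run time, so the
  # secret never becomes part of the rendered script.
  run: echo "$GHCR_TOKEN" | docker login ghcr.io -u "${{ github.actor }}" --password-stdin
```

The docker/login-action step used in the quick-reference deploy workflow achieves the same result without a hand-written shell line.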

Real-World Interview Prep

Q1: Why service containers instead of mocking databases?

A: Service containers run the real database. Mocks can miss connection-pool issues, race conditions, and SQL syntax errors. A real Postgres in CI gives much higher confidence that code passing the tests also works in production.

Q2: What’s the trade-off between GHCR (free, public) and private registries?

A: GHCR is free for public images and also hosts private images. For private BFSI workloads, a cloud registry such as AWS ECR or GCP Artifact Registry is often preferred: costs are minor, but keeping image pulls inside the VPC matters for compliance.

Q3: How do you roll back a bad deploy?

A: Re-tag the previous working image as :latest, push it, then SSH into the VPS and run docker compose pull && docker compose up -d. Add a manually triggered “rollback” workflow (workflow_dispatch) in Actions so this becomes a one-click operation.
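
The rollback described above could be sketched as a separate, manually triggered workflow. Everything here is an assumption layered on the document's examples: the file name rollback.yml, the tag input, and the premise that the compose file on the VPS references the :latest tag are all hypothetical; the secrets and paths are copied from the deploy.yml walkthrough.

```yaml
# .github/workflows/rollback.yml (hypothetical file)
name: Rollback
on:
  workflow_dispatch:
    inputs:
      tag:
        description: "Known-good image tag to restore, e.g. v1.2.2"
        required: true

jobs:
  rollback:
    runs-on: ubuntu-latest
    permissions:
      packages: write
    env:
      IMAGE: ghcr.io/${{ github.repository_owner }}/fca-multi-agent-support/fca-app
      TAG: ${{ inputs.tag }}
    steps:
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Re-tag the known-good image as latest
        run: |
          docker pull "$IMAGE:$TAG"
          docker tag "$IMAGE:$TAG" "$IMAGE:latest"
          docker push "$IMAGE:latest"
      - name: Restart on the VPS
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.VPS_HOST }}
          username: ${{ secrets.VPS_USER }}
          key: ${{ secrets.VPS_SSH_KEY }}
          script: |
            cd /opt/fca
            docker compose pull
            docker compose up -d
```

Rolling back by re-tagging works only if older images keep an immutable tag, which the tag-triggered deploy provides through the release version.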
