Redis Cache Service with Normalized Keys & TTL Strategy

What? (Concept Overview)

A cache service wraps a Redis client behind a domain-flavoured API and a normalised key format, so any caller can use cache.get(...) without thinking about string hashing, key prefixes, or TTL semantics. The pattern isolates three concerns: (1) connection management, (2) key derivation (deterministic, namespaced), and (3) a safe fallback when Redis is unreachable.

Project Context

The FCA Support Agent's CacheService (app/services/cache_service.py) caches high-frequency LLM responses (e.g., FAQ queries, account-balance lookups) keyed by a truncated SHA-256 hash of the normalised query string. Normalisation lowercases the query, strips leading and trailing whitespace, and collapses repeated whitespace, so "How do I open an account?" and "  how do I   open an account? " share one cache entry. Punctuation is not stripped, so the same question without the question mark gets its own key. The service gates on settings.redis_enabled so the app boots without Redis in dev.
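A minimal standalone sketch of the normalisation and key derivation described above, with no Redis involved. The function names normalise and cache_key are illustrative stand-ins for the service's own helpers. Note that punctuation is left untouched in this sketch, so dropping the question mark produces a different key.

```python
import hashlib


def normalise(query: str) -> str:
    """Lowercase, trim, and collapse runs of whitespace into single spaces."""
    return " ".join(query.lower().split())


def cache_key(namespace: str, query: str) -> str:
    """Namespaced key built from the first 16 hex chars of the SHA-256 digest."""
    digest = hashlib.sha256(normalise(query).encode("utf-8")).hexdigest()[:16]
    return f"fca:{namespace}:{digest}"


a = cache_key("faq", "How do I open an account?")
b = cache_key("faq", "  how do I   OPEN an account? ")
c = cache_key("faq", "how do i open an account")  # no question mark

print(a == b)  # True: only case and whitespace differ
print(a == c)  # False: this normalisation does not strip punctuation
```

If punctuation-insensitive matching is wanted, that would be an extra normalisation step and a deliberate design decision, since it also merges queries that differ only in punctuation.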

How? (Quick Reference Blocks)

3.1 The Cache Service Skeleton


# app/services/cache_service.py
import json
import hashlib
import redis.asyncio as aioredis
from app.config import settings

class CacheService:
    DEFAULT_TTL_SECONDS = 300            # 5 minutes

    def __init__(self) -> None:
        # from_url only builds the client; the connection opens on first command.
        self.client: aioredis.Redis | None = None
        if settings.redis_enabled:
            self.client = aioredis.from_url(
                settings.redis_url, decode_responses=True,
            )

    @staticmethod
    def normalise(query: str) -> str:
        """Lowercase, strip, collapse internal whitespace."""
        return " ".join(query.lower().split())

    def _key(self, namespace: str, query: str) -> str:
        # Pure CPU work, so neither helper needs to be a coroutine.
        norm = self.normalise(query)
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()[:16]
        return f"fca:{namespace}:{digest}"

    async def get(self, namespace: str, query: str) -> str | None:
        if self.client is None:          # cache disabled: behave like a miss
            return None
        try:
            return await self.client.get(self._key(namespace, query))
        except aioredis.RedisError:
            return None    # never crash the request path

    async def set(self, namespace: str, query: str, value: str) -> None:
        if self.client is None:
            return
        try:
            await self.client.set(
                self._key(namespace, query),
                value,
                ex=self.DEFAULT_TTL_SECONDS,   # every entry expires
            )
        except aioredis.RedisError:
            pass    # cache failures are silent on the write path too

3.2 Use-Site: Caching an FAQ Reply


# inside an agent or service that handles FAQs
import json

cache = CacheService()
KEY_NS = "faq"

async def get_faq_reply(question: str) -> str:
    cached = await cache.get(KEY_NS, question)
    if cached is not None:                            # hit: skip the LLM call
        return json.loads(cached)["answer"]
    answer = await llm.generate_faq_reply(question)   # slow path (llm is the app's LLM client)
    await cache.set(KEY_NS, question, json.dumps({"answer": answer}))
    return answer

3.3 Cache-Aside Pattern in FastAPI


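# in an endpoint handler
# CacheService defines no __aenter__/__aexit__, so it cannot be used with
# "async with". Share one instance (and its connection pool) across requests.
cache = CacheService()

@router.get("/products/recommendations")
async def recommendations(query: str = Query(...)):
    cached = await cache.get("recommendations", query)
    if cached is not None:
        return JSONResponse(json.loads(cached))
    async with ProductService() as svc:
        recs = await svc.find_recommendations(query)
    # default=str stringifies values json cannot encode (e.g. datetime, Decimal)
    await cache.set(
        "recommendations", query, json.dumps(recs, default=str),
    )
    return recs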

Why? (Parameter Breakdown)

redis.asyncio (the former aioredis, merged into redis-py): Native async client. The sync redis.Redis blocks the event loop; even a 5 ms Redis hop matters at 1k RPS.
decode_responses=True: Returns str instead of bytes, so callers don't have to wrap every cache.get(...) in .decode("utf-8").
Hash-based normalised key: Gives case- and whitespace-insensitive equality, so "Account Balance" and " account balance " share a cache entry. Without normalisation, every minor variation creates a new key (a cache miss). The first 16 hex chars of SHA-256 give a 64-bit key space, which is adequate for cache keys at this scale.
fca:{namespace}:{digest} key prefix: Redis has no built-in namespaces; the colon-delimited prefix is a convention that lets you target one namespace with a glob (SCAN with MATCH fca:faq:*). DEL does not accept globs, so an emergency flush looks like redis-cli --scan --pattern 'fca:faq:*' | xargs redis-cli DEL.
TTL of 5 minutes (300s): Cached data older than 5 minutes is stale enough to risk misleading responses. Dropping the TTL (caching forever) is a mistake in BFSI, where customers and regulators expect up-to-date data.
try/except aioredis.RedisError returning None: The cache is optional; never break a request because the cache is down. The None return triggers the slow path; ideally also log the failure or emit a metric so outages stay visible.
settings.redis_enabled gate: Local dev can switch the cache on (for example after docker compose up redis); without Redis the service is a no-op. This avoids startup failures when env vars differ from prod.
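To put a number on how much headroom a 64-bit truncated digest gives, the sketch below applies the standard birthday-bound approximation. The function name and the sample key counts are illustrative, not taken from the project.

```python
import math


def collision_probability(n_keys: int, bits: int = 64) -> float:
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n_keys * (n_keys - 1) / 2 ** (bits + 1))


for n in (1_000_000, 100_000_000, 1_000_000_000):
    print(f"{n:>13,} keys -> p(any collision) ~ {collision_probability(n):.2e}")
```

At one million live keys the chance of any collision is on the order of 3e-8; it only reaches a few percent around a billion keys. Because a collision here means one query is served another query's answer, a longer prefix is a cheap safeguard if the key count could grow that far.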

Common Pitfalls

Storing PII in cache keys. A key like fca:balance:customer_42 leaks internal IDs through Redis monitoring dashboards. Use hash-derived keys for any sensitive namespace, ideally a keyed hash (HMAC), because a plain hash of a small ID space can be reversed by enumeration.
No TTL on cached entries. Without ex= the entry persists forever, surviving schema migrations and producing subtly wrong responses ("why does this customer balance look like 2024 Q1?"). Always set a TTL.
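For the PII pitfall, one option is to derive the key with a keyed hash so that raw identifiers never appear in Redis and cannot be recovered by hashing candidate IDs. This is a sketch under assumptions: the CACHE_KEY_SECRET environment variable and the sensitive_key helper are hypothetical and not part of the service shown earlier.

```python
import hashlib
import hmac
import os

# Hypothetical secret; a real deployment would load it from a secrets manager.
SECRET = os.environ.get("CACHE_KEY_SECRET", "dev-only-secret").encode("utf-8")


def sensitive_key(namespace: str, identifier: str) -> str:
    """Derive an opaque cache key so the raw identifier is not stored in Redis."""
    mac = hmac.new(SECRET, identifier.encode("utf-8"), hashlib.sha256)
    return f"fca:{namespace}:{mac.hexdigest()[:16]}"


key = sensitive_key("balance", "customer_42")
print(key)
```

The trade-off is that rotating the secret invalidates every key in the namespace at once, which is usually acceptable for a cache with short TTLs.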

Real-World Interview Prep

Q1: When would you prefer a write-through cache (write DB + write Redis in the same transaction) over cache-aside?

A: Cache-aside (lazy, this page's pattern) is best when (a) reads vastly outnumber writes (FAQ lookups), (b) staleness up to the TTL is acceptable, and (c) you want eviction to be policy-driven (TTL). Write-through is best when (a) the cache MUST match the DB on read-after-write, (b) you have a strong-consistency requirement, or (c) the cache is mandatory for the workload. Banking balances rarely use cache-aside because of the staleness risk; product metadata is fine.

Q2: How do you prevent a cache stampede (1k concurrent requests for a missing key)?

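A: Within one process, add an asyncio.Lock per key; across processes, use the Redis SET NX pattern:


# Coalescing: only one coroutine rebuilds; the rest wait on the per-key lock.
# cache.coalesce is a hypothetical helper returning a per-key asyncio.Lock;
# it is not part of the skeleton in 3.1.
if await cache.get(namespace, query) is None:
    async with cache.coalesce(namespace, query):
        if await cache.get(namespace, query) is None:   # double-check
            value = await slow_build_value()
            await cache.set(namespace, query, value)

An asyncio.Lock only coordinates coroutines inside one process. For multi-process stampedes use Redis itself: SET lock:<key> <token> NX EX 30 takes a 30-second lock, and the losers poll for the cached key with bounded retries.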
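The in-process variant can be sketched end to end with an in-memory dict standing in for Redis and a sleep standing in for the LLM call. All names here (get_or_build, _store, _locks, slow_build_value) are illustrative assumptions rather than part of the project's CacheService.

```python
import asyncio
from collections import defaultdict

# Stand-ins for Redis and the slow path so the sketch runs on its own.
_store: dict[str, str] = {}
_locks: defaultdict[str, asyncio.Lock] = defaultdict(asyncio.Lock)
build_calls = 0


async def slow_build_value(key: str) -> str:
    global build_calls
    build_calls += 1
    await asyncio.sleep(0.05)          # simulate the expensive rebuild
    return f"value-for-{key}"


async def get_or_build(key: str) -> str:
    value = _store.get(key)
    if value is not None:
        return value
    async with _locks[key]:            # one rebuild per key per process
        value = _store.get(key)        # double-check after acquiring the lock
        if value is None:
            value = await slow_build_value(key)
            _store[key] = value
    return value


async def main() -> list[str]:
    # 100 concurrent requests for the same missing key.
    return await asyncio.gather(*(get_or_build("faq:abc") for _ in range(100)))


results = asyncio.run(main())
print(build_calls)  # 1
```

The lock registry in this sketch grows with the number of distinct keys; a longer-lived implementation would need to prune it, for example once a key has been populated.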

Q3: Why bother caching LLM responses when they’re already fast?

A: Three reasons. (1) Cost: every cache hit saves tokens. At $0.05 per 1k tokens and roughly 1k tokens per request, 1M cached FAQ hits save about $50,000. (2) Latency variance: uncached calls can spike to 5s under load. (3) Rate-limit headroom: provider rate limits are per-minute, so cutting token spend helps the service stay under the cap.
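The cost saving scales with tokens per request, so it is worth making that input explicit. The sketch below works through the arithmetic; the 1,000 tokens-per-request figure is an assumption for illustration, while the price is the one quoted in the answer.

```python
PRICE_PER_1K_TOKENS = 0.05   # USD, the figure quoted in the answer
TOKENS_PER_REQUEST = 1_000   # assumption: prompt plus completion for one FAQ reply
CACHED_HITS = 1_000_000


def savings_usd(hits: int, tokens_per_request: int, price_per_1k: float) -> float:
    """Cost avoided when `hits` requests are answered from the cache."""
    return hits * tokens_per_request / 1_000 * price_per_1k


print(f"${savings_usd(CACHED_HITS, TOKENS_PER_REQUEST, PRICE_PER_1K_TOKENS):,.0f}")
```

With these inputs the saving is on the order of fifty thousand dollars; shorter FAQ exchanges scale it down proportionally.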

Top-to-Bottom Code Walkthrough (`app/services/cache_service.py`)

This file wraps Redis so every cache call goes through a single safe-fallback entrypoint. The most important guarantee is “Redis down ≠ request down”.

`__init__`

self.client = redis.asyncio.Redis.from_url(...) — async client; its constructor does NOT actually connect. Connection happens on first command.
self.default_ttl = 300 — five minutes for anonymous reads.
The class never raises when Redis is disabled; is_available() just returns False.

`generate_cache_key(namespace, params)`

Builds f"{namespace}:{hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()[:16]}". The JSON-sorted-keys trick is the secret: {"a":1,"b":2} and {"b":2,"a":1} produce the same hash.
Truncating to 16 hex chars keeps keys short (Redis has a per-key memory cost).
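A standalone sketch of the key builder described above, showing that key order in params does not affect the result. The default=str argument is an addition not present in the description; it lets values such as datetimes serialise instead of raising.

```python
import hashlib
import json
from typing import Any


def generate_cache_key(namespace: str, params: dict[str, Any]) -> str:
    """Hash the canonical JSON form of params so key order is irrelevant."""
    canonical = json.dumps(params, sort_keys=True, default=str)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{namespace}:{digest}"


k1 = generate_cache_key("recs", {"a": 1, "b": 2})
k2 = generate_cache_key("recs", {"b": 2, "a": 1})
print(k1 == k2)  # True
```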

`get(key) -> Optional[Any]`

await self.client.get(key) returns the value previously stored (a str when the client was created with decode_responses=True, otherwise bytes), or None on a miss.
json.loads(value) reconstructs the original Python object.
Wrapped in try/except so any exception (ConnectionError, TimeoutError, JSONDecodeError) returns None rather than bubbling.
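The fallback behaviour can be exercised without a Redis server. In this sketch FakeClient is an in-memory stand-in for the async client and safe_get is a hypothetical free-function version of the method described above; neither is the project's actual code.

```python
import asyncio
import json
from typing import Any, Optional


class FakeClient:
    """In-memory stand-in for the async Redis client (illustration only)."""

    def __init__(self, data: dict[str, str], fail: bool = False) -> None:
        self.data, self.fail = data, fail

    async def get(self, key: str) -> Optional[str]:
        if self.fail:
            raise ConnectionError("simulated outage")
        return self.data.get(key)


async def safe_get(client: Optional[FakeClient], key: str) -> Optional[Any]:
    """Return the decoded value, or None on a miss, bad payload, or outage."""
    if client is None:              # cache disabled
        return None
    try:
        raw = await client.get(key)
        return None if raw is None else json.loads(raw)
    except Exception:               # connection, timeout, and JSON decode errors
        return None


good = FakeClient({"k": '{"answer": "42"}', "bad": "{not json"})
hit = asyncio.run(safe_get(good, "k"))
miss = asyncio.run(safe_get(good, "missing"))
corrupt = asyncio.run(safe_get(good, "bad"))
outage = asyncio.run(safe_get(FakeClient({}, fail=True), "k"))
print(hit, miss, corrupt, outage)
```

Catching Exception broadly matches the description, but it also hides programming errors, so logging the exception before returning None is a reasonable refinement.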

`set(key, value, ttl)`

await self.client.setex(key, ttl, json.dumps(value)) — setex is a single round-trip for “set with expiry”. The TTL is mandatory to prevent unbounded Redis growth.

`delete(key)` and `flush_pattern(pattern)`

delete removes a single known key (e.g. after invalidation on write).
flush_pattern uses scan_iter(match=pattern) (NOT keys(), which blocks Redis) to find and delete every key matching a glob.
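A sketch of how a scan-based flush might be structured, deleting in batches rather than one key at a time. FakeClient is an in-memory stand-in that mimics the two calls used (scan_iter as an async iterator and delete returning the number of keys removed); the batch_size parameter is an assumption, not something the walkthrough specifies.

```python
import asyncio
import fnmatch
from typing import AsyncIterator


class FakeClient:
    """In-memory stand-in exposing only the calls this sketch needs."""

    def __init__(self, keys: list[str]) -> None:
        self.keys = set(keys)

    async def scan_iter(self, match: str) -> AsyncIterator[str]:
        for key in sorted(self.keys):       # iterate over a snapshot
            if fnmatch.fnmatchcase(key, match):
                yield key

    async def delete(self, *keys: str) -> int:
        removed = self.keys & set(keys)
        self.keys -= removed
        return len(removed)


async def flush_pattern(client: FakeClient, pattern: str, batch_size: int = 500) -> int:
    """Delete keys matching a glob in batches, iterating with SCAN, not KEYS."""
    deleted, batch = 0, []
    async for key in client.scan_iter(match=pattern):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += await client.delete(*batch)
            batch = []
    if batch:                               # flush the final partial batch
        deleted += await client.delete(*batch)
    return deleted


client = FakeClient(["fca:faq:a1", "fca:faq:b2", "fca:recs:c3"])
removed = asyncio.run(flush_pattern(client, "fca:faq:*", batch_size=1))
print(removed, sorted(client.keys))
```

Batching keeps each DEL small, so no single command holds the server for long even when the pattern matches many keys.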

Common Pitfalls

Using redis.keys(pattern) in production blocks the Redis event loop for the duration of the scan. Use redis.scan_iter(match=pattern) — it streams.

Storing pickle instead of json opens a deserialisation attack vector. Anyone who can write to Redis can craft a malicious pickle that executes code when it is loaded. JSON keeps the surface small.

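Calling client.get() on a closed connection raises a connection error (redis.exceptions.ConnectionError in redis-py; the legacy aioredis 1.x client called it ConnectionClosedError) instead of returning None. Always wrap the call in try/except.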

Real-World Interview Prep

Q1: Why is “Redis disabled = no-op” better than “Redis enabled = crash on outage”?

A: The cache should be invisible to the caller. If your app crashes on a Redis blip, you have added a second single point of failure (Redis) on top of the DB. Always degrade gracefully.

Q2: What’s wrong with `dict(some_obj)` for cache keys?

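A: dict(obj) does not serialise anything: it only accepts mappings or iterables of pairs, makes a shallow copy, and the result is unhashable. Its string form also depends on insertion order, so two equal dicts can yield different keys. json.dumps(obj, sort_keys=True, default=str) produces a canonical string that is portable across processes and versions.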

Q3: Why does `setex` beat `set(...); expire(...)`?

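A: setex is one round-trip; set + expire is two, and two RTTs on a hot path double the latency. setex is also atomic: with separate calls, a worker crash between set and expire would leave a key with no TTL (a slow leak). SET with the EX option (client.set(key, value, ex=ttl)) gives the same single-command guarantee.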
