Redis Cache Service with Normalized Keys & TTL Strategy
What? (Concept Overview)
A cache service wraps a Redis client behind a domain-flavoured API and a normalised key format so any caller can cache.get(key) without thinking about string hashing, key prefixes, or TTL semantics. The pattern isolates three concerns: (1) connection management, (2) key derivation (deterministic, namespaced), and (3) safe-fallback when Redis is unreachable.
Project Context
The FCA Support Agent's CacheService (app/services/cache_service.py) caches high-frequency LLM responses (e.g., FAQ queries, account-balance lookups) keyed by a truncated SHA-256 hash of the normalised query string. Normalisation lowercases the query, strips leading and trailing whitespace, and collapses internal whitespace runs, so "How do I   open an account?" and " how do i open an account? " share one cache entry. Punctuation is not stripped, so the same question without its question mark gets a different key. The service gates on settings.redis_enabled so the app boots without Redis in dev.
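The normalisation and key derivation described above can be sketched without Redis. This is a minimal standalone version; the free functions normalise and cache_key are illustrative names that mirror the service's methods rather than its actual API.

```python
import hashlib


def normalise(query: str) -> str:
    """Lowercase, strip, and collapse internal whitespace runs."""
    return " ".join(query.lower().split())


def cache_key(namespace: str, query: str) -> str:
    """Build fca:{namespace}:{first 16 hex chars of SHA-256(normalised query)}."""
    digest = hashlib.sha256(normalise(query).encode("utf-8")).hexdigest()[:16]
    return f"fca:{namespace}:{digest}"


a = cache_key("faq", "How do I   open an account?")
b = cache_key("faq", "  how do i open an account?  ")
c = cache_key("faq", "how do i open an account")  # no question mark

print(a == b)  # True: case and whitespace differences are normalised away
print(a == c)  # False: punctuation is part of the key
```

If punctuation-insensitive matching is wanted, the normalise step is the place to add it; doing so changes every existing key.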
How? (Quick Reference Blocks)
3.1 The Cache Service Skeleton
    # app/services/cache_service.py
    import json
    import hashlib

    import redis.asyncio as aioredis

    from app.config import settings


    class CacheService:
        DEFAULT_TTL_SECONDS = 300  # 5 minutes

        def __init__(self) -> None:
            self.client: aioredis.Redis | None = None
            if settings.redis_enabled:
                self.client = aioredis.from_url(
                    settings.redis_url, decode_responses=True,
                )

        async def normalise(self, query: str) -> str:
            """Lowercase, strip, collapse internal whitespace."""
            return " ".join(query.lower().split())

        async def _key(self, namespace: str, query: str) -> str:
            norm = await self.normalise(query)
            digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()[:16]
            return f"fca:{namespace}:{digest}"

        async def get(self, namespace: str, query: str) -> str | None:
            if self.client is None:
                return None
            try:
                return await self.client.get(await self._key(namespace, query))
            except aioredis.RedisError:
                return None  # never crash the request path

        async def set(self, namespace: str, query: str, value: str) -> None:
            if self.client is None:
                return
            try:
                await self.client.set(
                    await self._key(namespace, query),
                    value,
                    ex=self.DEFAULT_TTL_SECONDS,
                )
            except aioredis.RedisError:
                pass  # cache failures are silent on the write path too

3.2 Use-Site: Caching an FAQ Reply
    # inside an agent or service that handles FAQs
    import json

    from app.services.cache_service import CacheService

    cache = CacheService()
    KEY_NS = "faq"

    async def get_faq_reply(question: str) -> str:
        cached = await cache.get(KEY_NS, question)
        if cached:
            return json.loads(cached)["answer"]
        # llm is the project's LLM client, defined elsewhere
        answer = await llm.generate_faq_reply(question)  # slow path
        await cache.set(KEY_NS, question, json.dumps({"answer": answer}))
        return answer

3.3 Cache-Aside Pattern in FastAPI
    # in an endpoint handler
    # One shared instance: CacheService defines no __aenter__/__aexit__, so it
    # cannot be used with "async with", and a per-request instance would open
    # a new connection pool on every call.
    cache = CacheService()

    @router.get("/products/recommendations")
    async def recommendations(query: str = Query(...)):
        cached = await cache.get("recommendations", query)
        if cached:
            return JSONResponse(json.loads(cached))
        async with ProductService() as svc:
            recs = await svc.find_recommendations(query)
        await cache.set(
            "recommendations", query, json.dumps(recs, default=str),
        )
        return recs

Why? (Parameter Breakdown)
- redis.asyncio (the former aioredis, now part of redis-py): Native async client. The sync redis.Redis blocks the event loop; even a 5 ms Redis hop matters at 1k RPS.
- decode_responses=True: Returns str instead of bytes. Callers don't have to wrap every cache.get(...) in .decode("utf-8").
- Hash-based normalised key: Queries that differ only in case or whitespace are treated as equal, so "Account Balance" and " account balance" share a cache entry. Without normalisation, every minor variation creates a new key (cache miss). The first 16 hex chars of SHA-256 give a 64-bit digest, so collisions only become likely at around 2^32 (a few billion) keys per namespace. That is adequate for cache keys, though a collision would serve one query's cached answer to another.
- fca:{namespace}:{digest} key prefix: Redis has no first-class namespaces; the colon-separated prefix is a convention that makes keys easy to match (SCAN with MATCH fca:faq:*). DEL does not accept glob patterns, so an emergency flush is redis-cli --scan --pattern 'fca:faq:*' | xargs redis-cli DEL.
- TTL of 5 minutes (300s): Cached data older than 5 min is stale enough to risk misleading responses. Dropping the TTL (caching forever) is a mistake in BFSI, where stale financial data can mislead customers and create compliance risk.
- try/except aioredis.RedisError returning None: The cache is an optimisation, not a dependency; never break a request because the cache is down. The None return triggers the slow path; consider also emitting a metric so outages stay visible.
- settings.redis_enabled gate: Local dev can run with or without Redis (docker compose up redis); without Redis the service is a no-op. Avoids startup failures when env vars differ from prod.
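To put a number on "64 bits is adequate", the standard birthday-bound approximation can be evaluated directly. This is a rough estimate that assumes SHA-256 output behaves like uniformly random bits; collision_probability is an illustrative helper, not part of the service.

```python
import math


def collision_probability(n_keys: int, bits: int = 64) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n_keys * (n_keys - 1) / 2 ** (bits + 1))


# Probability that at least two of n distinct queries share a 64-bit digest.
for n in (1_000_000, 1_000_000_000, 2**32):
    print(f"{n:>13,} keys -> p ~ {collision_probability(n):.3g}")
```

At a million live keys the chance of any collision is far below one in a million; it only approaches even odds once the key count reaches the billions.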
Common Pitfalls
- Storing PII in cache keys. A key like fca:balance:customer_42 leaks internal IDs through Redis monitoring dashboards. Use SHA-derived keys for any sensitive namespace.
- No TTL on cached entries. Without ex= the entry persists forever, surviving schema migrations and producing subtly wrong responses ("why does this customer balance look like 2024 Q1?"). Always set a TTL.
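One way to keep identifiers out of key names is sketched below. It goes a step beyond the plain SHA-derived keys mentioned above by using a keyed HMAC, because a bare hash of a short ID such as customer_42 can be reversed by enumeration. The names sensitive_key and KEY_SECRET are hypothetical, and the secret shown is a placeholder.

```python
import hashlib
import hmac

# Hypothetical secret for illustration only; load it from configuration in practice.
KEY_SECRET = b"replace-me"


def sensitive_key(namespace: str, identifier: str, secret: bytes = KEY_SECRET) -> str:
    """Derive a cache key that does not expose the raw identifier."""
    digest = hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
    return f"fca:{namespace}:{digest}"


key = sensitive_key("balance", "customer_42")
print(key)  # fca:balance:<16 hex chars>, with no customer ID visible
```

The derivation is deterministic, so the same customer always maps to the same key as long as the secret is unchanged; rotating the secret effectively invalidates the namespace.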
Real-World Interview Prep
Q1: When would you prefer a write-through cache (every write updates the DB and Redis in the same code path) over cache-aside?
A: Cache-aside (lazy, this page’s pattern) is best when (a) reads vastly outnumber writes (FAQ lookups), (b) staleness is acceptable for a few seconds, (c) you want eviction to be policy-driven (TTL). Write-through is best when (a) the cache MUST match the DB on read-after-write, (b) you have a strong-consistency requirement, (c) cache is mandatory for the workload. Banking balances rarely use cache-aside because of staleness risk; product metadata is fine.
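The read-after-write difference can be shown with plain dictionaries standing in for the database and Redis. This is a toy model under stated assumptions: there is no TTL or invalidation, and CacheAsideStore and WriteThroughStore are illustrative names.

```python
from typing import Optional


class CacheAsideStore:
    """Reads populate the cache lazily; writes go to the DB only."""

    def __init__(self) -> None:
        self.db: dict[str, str] = {}
        self.cache: dict[str, str] = {}

    def read(self, key: str) -> Optional[str]:
        if key in self.cache:  # cache hit
            return self.cache[key]
        value = self.db.get(key)  # cache miss: slow path
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key: str, value: str) -> None:
        self.db[key] = value  # the cache keeps the old value until it expires


class WriteThroughStore(CacheAsideStore):
    """Writes update the DB and the cache together."""

    def write(self, key: str, value: str) -> None:
        self.db[key] = value
        self.cache[key] = value


lazy, eager = CacheAsideStore(), WriteThroughStore()
for store in (lazy, eager):
    store.write("balance", "100")
    store.read("balance")  # warms the cache
    store.write("balance", "250")

print(lazy.read("balance"))   # 100 (stale until the entry expires or is invalidated)
print(eager.read("balance"))  # 250
```

In a real cache-aside setup the stale window is bounded by the TTL, or closed by deleting the key on write.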
Q2: How do you prevent a cache stampede (1k concurrent requests for a missing key)?
A: Add an asyncio.Lock per-key OR use the SET-NX pattern:
    # Coalescing: only one task rebuilds; the rest wait briefly.
    # cache.coalesce is a hypothetical per-key lock helper, not part of the skeleton above.
    if await cache.get(...) is None:
        async with cache.coalesce(key):
            if await cache.get(...) is None:  # double-check
                value = await slow_build_value()
                await cache.set(key, value)

An asyncio.Lock only coordinates tasks inside one process. For multi-process stampedes use Redis itself (SET lock:key NX EX 30 takes a 30-second lock; losers poll for the key with bounded retries).
Q3: Why bother caching LLM responses when they’re already fast?
A: Three reasons. (1) Cost: every cached hit saves tokens. At an illustrative $0.05 per 1k tokens and roughly 1k tokens per reply, 1M cached FAQ hits save about $50,000. (2) Latency variance: uncached requests can spike to 5s under load. (3) Rate-limit headroom: provider rate limits are per-minute; caching cuts token spend and helps stay under the cap.
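The cost argument is simple arithmetic: hits times tokens per reply times price per token. The sketch below assumes about 1,000 tokens per reply, a figure the text does not state, and reuses the $0.05 per 1k tokens price purely as an illustration.

```python
def savings_usd(cache_hits: int, tokens_per_reply: int, usd_per_1k_tokens: float) -> float:
    """Estimated spend avoided by serving cache_hits replies from the cache."""
    return cache_hits * tokens_per_reply / 1000 * usd_per_1k_tokens


# 1M hits x 1,000 tokens each = 1 billion tokens that were never generated.
print(f"${savings_usd(1_000_000, 1_000, 0.05):,.0f}")  # $50,000
```

With one-token replies the same formula gives $50, so the saving scales directly with reply length.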
Top-to-Bottom Code Walkthrough (app/services/cache_service.py)
This file wraps Redis so every cache call goes through a single safe-fallback entrypoint. The most important guarantee is “Redis down ≠ request down”.
__init__
- self.client = redis.asyncio.Redis.from_url(...): async client; its constructor does NOT actually connect. Connection happens on first command.
- self.default_ttl = 300: five minutes for anonymous reads.
- The class never raises when Redis is disabled; is_available() just returns False.
generate_cache_key(namespace, params)
- Builds f"{namespace}:{hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()[:16]}". The JSON-sorted-keys trick is the secret: {"a":1,"b":2} and {"b":2,"a":1} produce the same hash.
- Truncating to 16 hex chars keeps keys short (Redis has a per-key memory cost).
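The sorted-keys behaviour is easy to check in isolation. The sketch below writes the expression from the walkthrough as a standalone function; the real method lives on the class and may differ in detail.

```python
import hashlib
import json
from typing import Any


def generate_cache_key(namespace: str, params: dict[str, Any]) -> str:
    """Hash the params as JSON with sorted keys so dict ordering cannot matter."""
    payload = json.dumps(params, sort_keys=True).encode()
    return f"{namespace}:{hashlib.sha256(payload).hexdigest()[:16]}"


k1 = generate_cache_key("recommendations", {"a": 1, "b": 2})
k2 = generate_cache_key("recommendations", {"b": 2, "a": 1})
print(k1 == k2)  # True: insertion order does not change the key
```

Values still matter: {"a": 1} and {"a": "1"} serialise differently and therefore get different keys.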
get(key) -> Optional[Any]
- await self.client.get(key): returns the value previously stored (a str when decode_responses=True, bytes otherwise), or None.
- json.loads(value) reconstructs the original Python object.
- Wrapped in try/except so any exception (ConnectionError, TimeoutError, JSONDecodeError) returns None rather than bubbling.
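The "any failure becomes a miss" behaviour of get can be sketched with a fake client so it runs without Redis. safe_get and FakeClient are illustrative stand-ins; the real method is bound to the class and talks to redis.asyncio.

```python
import asyncio
import json
from typing import Any, Optional


async def safe_get(client: Any, key: str) -> Optional[Any]:
    """Return the decoded cached value, or None on a miss or any failure."""
    try:
        raw = await client.get(key)
        return None if raw is None else json.loads(raw)
    except Exception:  # connection errors, timeouts, corrupt JSON
        return None


class FakeClient:
    """In-memory stand-in for the async Redis client (illustration only)."""

    def __init__(self, data: dict[str, str], fail: bool = False) -> None:
        self.data, self.fail = data, fail

    async def get(self, key: str) -> Optional[str]:
        if self.fail:
            raise ConnectionError("redis unreachable")
        return self.data.get(key)


async def demo() -> list[Any]:
    ok = FakeClient({"good": '{"answer": 42}', "bad": "{not json"})
    down = FakeClient({}, fail=True)
    return [
        await safe_get(ok, "good"),     # hit
        await safe_get(ok, "missing"),  # miss
        await safe_get(ok, "bad"),      # corrupt payload
        await safe_get(down, "good"),   # Redis down
    ]


results = asyncio.run(demo())
print(results)  # [{'answer': 42}, None, None, None]
```

Catching Exception this broadly is reasonable on a cache read path, but it hides bugs as well as outages, so logging or counting the failures is worth considering.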
set(key, value, ttl)
- await self.client.setex(key, ttl, json.dumps(value)): setex is a single round-trip for "set with expiry". The TTL is mandatory to prevent unbounded Redis growth.
delete(key) and flush_pattern(pattern)
- delete removes a single known key (e.g. after invalidation on write).
- flush_pattern uses scan_iter(match=pattern) (NOT keys(), which blocks Redis) to find and delete every key matching a glob.
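A scan-based flush along the lines described above might look like this. It is a sketch, not the file's actual code: the function takes the client as a parameter, batch_size is an assumed tuning knob, and FakeClient imitates only the scan_iter and delete calls so the example runs without a server.

```python
import asyncio
import fnmatch
from typing import Any, AsyncIterator


async def flush_pattern(client: Any, pattern: str, batch_size: int = 500) -> int:
    """Delete every key matching pattern, scanning incrementally; return the count."""
    deleted = 0
    batch: list[str] = []
    async for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:  # delete in batches to limit round-trips
            deleted += await client.delete(*batch)
            batch.clear()
    if batch:  # leftover keys from the final partial batch
        deleted += await client.delete(*batch)
    return deleted


class FakeClient:
    """Tiny in-memory stand-in mimicking scan_iter/delete (illustration only)."""

    def __init__(self, keys: set[str]) -> None:
        self.keys = keys

    async def scan_iter(self, match: str, count: int = 10) -> AsyncIterator[str]:
        for key in sorted(self.keys):
            if fnmatch.fnmatchcase(key, match):
                yield key

    async def delete(self, *names: str) -> int:
        removed = self.keys & set(names)
        self.keys -= removed
        return len(removed)


client = FakeClient({"fca:faq:a1", "fca:faq:b2", "fca:balance:c3"})
removed = asyncio.run(flush_pattern(client, "fca:faq:*"))
print(removed, sorted(client.keys))  # 2 ['fca:balance:c3']
```

Against a real redis.asyncio client the same function should work unchanged, since scan_iter is an async iterator there and delete accepts multiple key names.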
Common Pitfalls
- Using redis.keys(pattern) in production blocks Redis for the duration of the scan, because commands run on a single thread and KEYS walks the whole keyspace. Use redis.scan_iter(match=pattern), which iterates in small batches.
- Storing pickle instead of JSON opens a deserialisation attack vector. Anyone who can write to Redis can craft a malicious pickle. JSON keeps the surface small.
- Calling client.get() on a closed or broken connection raises a connection error rather than returning None. Always wrap in try/except.
Real-World Interview Prep
Q1: Why is “Redis disabled = no-op” better than “Redis enabled = crash on outage”?
A: Cache should be invisible to the caller. If your app crashes on a Redis blip you've added a second SPOF (Redis) next to the one you already had (the DB). Always degrade gracefully.
Q2: What’s wrong with dict(some_obj) for cache keys?
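A: dict(obj) is not a serialisation step. It only works for mappings or iterables of pairs, it copies just the top level, and its string form depends on insertion order and on the repr of nested objects, which may not be deterministic. json.dumps(obj, sort_keys=True, default=str) produces a stable string that is portable across processes and versions.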
Q3: Why does setex beat set(...); expire(...)?
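A: setex is one round-trip; set + expire is two, which roughly doubles the Redis latency on a hot path. setex is also atomic: with separate calls, a worker crash between set and expire would leave a key with no TTL (a slow leak). SET with the EX option (set(key, value, ex=ttl), as in the skeleton in 3.1) gives the same single-command guarantee.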