Redis Cache Service with Normalized Keys & TTL Strategy

What? (Concept Overview)

A cache service wraps a Redis client behind a domain-flavoured API and a normalised key format so any caller can cache.get(key) without thinking about string hashing, key prefixes, or TTL semantics. The pattern isolates three concerns: (1) connection management, (2) key derivation (deterministic, namespaced), and (3) safe-fallback when Redis is unreachable.

Project Context

The FCA Support Agent’s CacheService (app/services/cache_service.py) caches high-frequency LLM responses (e.g., FAQ queries, account-balance lookups) keyed by a truncated SHA-256 hash of the normalised query string. Normalisation lowercases the query, strips leading and trailing whitespace, and collapses repeated whitespace, so "How do I open an account?" and "  how do i   open an account? " share one cache entry. Punctuation is not removed, so dropping the "?" yields a different key. The service gates on settings.redis_enabled so the app boots without Redis in dev.
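The normalise-then-hash step can be sketched with the standard library alone. The function names `normalise` and `cache_key` below are illustrative, mirroring the skeleton in 3.1 rather than quoting the project file.

```python
import hashlib


def normalise(query: str) -> str:
    """Lowercase, strip, and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def cache_key(namespace: str, query: str) -> str:
    """Derive a namespaced key from the first 16 hex chars of SHA-256."""
    digest = hashlib.sha256(normalise(query).encode("utf-8")).hexdigest()[:16]
    return f"fca:{namespace}:{digest}"


a = cache_key("faq", "How do I open an account?")
b = cache_key("faq", "  how do I   OPEN an account? ")
c = cache_key("faq", "how do i open an account")  # no "?": punctuation is kept

print(a == b)  # True: case and whitespace differences are normalised away
print(a == c)  # False: the missing "?" changes the hash
```

If punctuation-insensitive matching is wanted, it has to be added to `normalise` explicitly; the whitespace-only version shown here does not provide it.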

How? (Quick Reference Blocks)

3.1 The Cache Service Skeleton

```python
# app/services/cache_service.py
import json
import hashlib

import redis.asyncio as aioredis

from app.config import settings


class CacheService:
    DEFAULT_TTL_SECONDS = 300  # 5 minutes

    def __init__(self) -> None:
        self.client: aioredis.Redis | None = None
        if settings.redis_enabled:
            self.client = aioredis.from_url(
                settings.redis_url,
                decode_responses=True,
            )

    async def normalise(self, query: str) -> str:
        """Lowercase, strip, collapse internal whitespace."""
        return " ".join(query.lower().split())

    async def _key(self, namespace: str, query: str) -> str:
        norm = await self.normalise(query)
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()[:16]
        return f"fca:{namespace}:{digest}"

    async def get(self, namespace: str, query: str) -> str | None:
        if self.client is None:
            return None
        try:
            return await self.client.get(await self._key(namespace, query))
        except aioredis.RedisError:
            return None  # never crash the request path

    async def set(self, namespace: str, query: str, value: str) -> None:
        if self.client is None:
            return
        try:
            await self.client.set(
                await self._key(namespace, query),
                value,
                ex=self.DEFAULT_TTL_SECONDS,
            )
        except aioredis.RedisError:
            pass  # cache failures are silent on the write path too
```

3.2 Use-Site: Caching an FAQ Reply

```python
# inside an agent or service that handles FAQs
import json

from app.services.cache_service import CacheService

cache = CacheService()
KEY_NS = "faq"


async def get_faq_reply(question: str) -> str:
    cached = await cache.get(KEY_NS, question)
    if cached:
        return json.loads(cached)["answer"]
    # `llm` is the project's LLM client, assumed to be in scope here.
    answer = await llm.generate_faq_reply(question)  # slow path
    await cache.set(KEY_NS, question, json.dumps({"answer": answer}))
    return answer
```

3.3 Cache-Aside Pattern in FastAPI

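```python
# in an endpoint handler
import json

from fastapi import APIRouter, Query
from fastapi.responses import JSONResponse

from app.services.cache_service import CacheService

router = APIRouter()
# One shared instance: CacheService defines no __aenter__/__aexit__,
# so it cannot be used with `async with`.
cache = CacheService()


@router.get("/products/recommendations")
async def recommendations(query: str = Query(...)):
    cached = await cache.get("recommendations", query)
    if cached:
        return JSONResponse(json.loads(cached))
    # ProductService is defined elsewhere in the project.
    async with ProductService() as svc:
        recs = await svc.find_recommendations(query)
    payload = json.dumps(recs, default=str)
    await cache.set("recommendations", query, payload)
    # Return the same serialised shape on a hit and on a miss.
    return JSONResponse(json.loads(payload))
```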

Why? (Parameter Breakdown)

  • redis.asyncio (the former aioredis project, merged into redis-py as of 4.2) — Native async client. The sync redis.Redis client blocks the event loop; even a 5ms Redis hop matters at 1k RPS.
  • decode_responses=True — Returns str instead of bytes. Callers don’t have to wrap every cache.get(...) in .decode("utf-8").
  • Hash-based normalised key — Semantic equality: "Account Balance" and " account balance " share a cache entry. Without normalisation, every minor variation creates a new key (a cache miss). The first 16 hex chars of SHA-256 give 64 bits of key space, which is adequate for cache keys.
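  • fca:{namespace}:{digest} key prefix — Redis has no built-in namespaces; the colon-separated prefix is a naming convention that lets you select a group of keys by pattern (SCAN with MATCH fca:faq:*). DEL does not accept glob patterns, so an emergency flush looks like redis-cli --scan --pattern 'fca:faq:*' | xargs redis-cli DEL.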
  • TTL of 5 minutes (300s) — Data cached for longer than 5 minutes is stale enough to risk misleading responses. Dropping the TTL (caching forever) is a mistake in BFSI, where regulation and customer trust both depend on fresh data.
  • try/except aioredis.RedisError returning None — The cache is an optimisation, not a dependency; never break a request because the cache is down. The None return triggers the slow path; ideally also log or emit a metric so outages stay visible.
  • settings.redis_enabled gate — Local dev (docker compose up redis) toggles the cache; without Redis the service is a no-op. Avoids startup failures when env vars differ from prod.
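The "64 bits is adequate" claim can be checked with the standard birthday approximation, p ≈ 1 − exp(−n² / 2^(bits+1)). The sketch below is a back-of-the-envelope estimate, not a measurement of the project's key space; `collision_probability` is an illustrative name.

```python
import math


def collision_probability(num_keys: int, bits: int = 64) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n^2 / 2^(bits + 1))."""
    return -math.expm1(-(num_keys ** 2) / 2 ** (bits + 1))


for n in (1_000_000, 100_000_000, 5_000_000_000):
    print(f"{n:>13,} keys -> p = {collision_probability(n):.3g}")
```

At a million live keys the chance of any collision is on the order of a few in a hundred million; it only approaches even odds at several billion keys, far beyond a typical FAQ cache.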

Common Pitfalls

  1. Storing PII in cache keys. A key like fca:balance:customer_42 leaks internal IDs through Redis monitoring dashboards. Use SHA-derived keys for any sensitive namespace.
  2. No TTL on cached entries. Without ex= the entry persists forever, surviving schema migrations and producing subtly wrong responses (“why does this customer balance look like 2024 Q1?”). Always set a TTL.
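For the first pitfall, one option is a keyed hash: a plain SHA-256 of a small ID space such as customer_1 … customer_N can be reversed by hashing every candidate, whereas an HMAC cannot be recomputed without the secret. This is a suggested hardening under that assumption, not something the source file is known to do; `KEY_SECRET` and `private_key` are hypothetical names.

```python
import hashlib
import hmac

# Hypothetical secret; in practice load it from configuration, not source code.
KEY_SECRET = b"replace-me"


def private_key(namespace: str, identifier: str, secret: bytes = KEY_SECRET) -> str:
    """Build a cache key that does not expose the raw identifier."""
    digest = hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
    return f"fca:{namespace}:{digest}"


key = private_key("balance", "customer_42")
print(key)  # fca:balance:<16 hex chars>, with no customer ID visible
```

The key stays deterministic for a given secret, so cache hits still work, but rotating the secret invalidates every existing entry.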

Real-World Interview Prep

Q1: When would you prefer a write-through cache (write DB + write Redis in the same write path) over cache-aside?

A: Cache-aside (lazy, this page’s pattern) is best when (a) reads vastly outnumber writes (FAQ lookups), (b) staleness up to the TTL is acceptable, and (c) you want eviction to be policy-driven (TTL). Write-through is best when (a) the cache MUST match the DB on read-after-write, (b) you have a strong-consistency requirement, or (c) the cache is mandatory for the workload. Banking balances rarely use cache-aside because of staleness risk; product metadata is fine.
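The staleness difference can be seen in a toy model where plain dicts stand in for the database and Redis. The class names are illustrative, and TTL expiry is deliberately left out to keep the contrast visible.

```python
class CacheAsideStore:
    """Writes touch only the DB; the cache fills lazily on read."""

    def __init__(self) -> None:
        self.db: dict[str, int] = {}
        self.cache: dict[str, int] = {}

    def write(self, key: str, value: int) -> None:
        self.db[key] = value  # cache keeps its old entry until TTL or invalidation

    def read(self, key: str) -> int:
        if key in self.cache:
            return self.cache[key]
        value = self.db[key]
        self.cache[key] = value
        return value


class WriteThroughStore(CacheAsideStore):
    """Writes update the DB and the cache together."""

    def write(self, key: str, value: int) -> None:
        self.db[key] = value
        self.cache[key] = value


def balance_after_update(store: CacheAsideStore) -> int:
    store.write("acct", 100)
    store.read("acct")        # warms the cache
    store.write("acct", 250)  # balance changes
    return store.read("acct")


print(balance_after_update(CacheAsideStore()))    # 100 (stale until expiry)
print(balance_after_update(WriteThroughStore()))  # 250 (fresh)
```

With cache-aside, the stale read lasts until the TTL expires or the writer deletes the key explicitly, which is why it suits FAQ answers better than balances.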

Q2: How do you prevent a cache stampede (1k concurrent requests for a missing key)?

A: Add an asyncio.Lock per-key OR use the SET-NX pattern:

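```python
# Coalescing: only one coroutine rebuilds; the rest wait briefly.
# `cache.coalesce(key)` is a hypothetical helper that returns a per-key lock.
if await cache.get(...) is None:
    async with cache.coalesce(key):
        if await cache.get(...) is None:  # double-check after acquiring the lock
            value = await slow_build_value()
            await cache.set(key, value)
```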

For multi-process stampedes use Redis itself (SET lock:key NX EX 30 → 30-second lock; losers poll for the key with bounded retries).
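A minimal in-process version of the per-key asyncio.Lock idea might look like the following. `SingleFlightCache` and `get_or_build` are illustrative names, and a dict stands in for Redis so the sketch runs on its own.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable


class SingleFlightCache:
    """In-process cache where only one coroutine rebuilds a missing key."""

    def __init__(self) -> None:
        self._values: dict[str, str] = {}
        self._locks: defaultdict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    async def get_or_build(self, key: str, build: Callable[[], Awaitable[str]]) -> str:
        if key in self._values:           # fast path, no lock taken
            return self._values[key]
        async with self._locks[key]:      # one lock per key
            if key in self._values:       # double-check after waiting
                return self._values[key]
            value = await build()
            self._values[key] = value
            return value


async def demo() -> tuple[int, set[str]]:
    calls = 0

    async def slow_build_value() -> str:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # stand-in for the slow LLM or DB call
        return "answer"

    cache = SingleFlightCache()
    results = await asyncio.gather(
        *(cache.get_or_build("faq:1", slow_build_value) for _ in range(100))
    )
    return calls, set(results)


print(asyncio.run(demo()))  # (1, {'answer'})
```

The lock only coordinates coroutines inside one process, and the lock table grows with the number of distinct keys, so a real service would prune it and still rely on the Redis-side lock for multiple workers.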

Q3: Why bother caching LLM responses when they’re already fast?

A: Three reasons. (1) Cost — every cache hit saves tokens. At $0.05 per 1k tokens, and assuming roughly 1k tokens per reply, 1M cached FAQ hits save about $50,000. (2) Latency variance — uncached calls can spike to 5s under load. (3) Rate-limit headroom — provider rate limits are per-minute; caching cuts token spend and helps the app stay under the cap.
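The cost arithmetic depends heavily on how many tokens each reply consumes, so it helps to make that assumption explicit. The figures below are illustrative inputs, not measured project data.

```python
def tokens_cost_saved(hits: int, tokens_per_hit: int, usd_per_1k_tokens: float) -> float:
    """Dollars not spent because `hits` requests were served from cache."""
    return hits * tokens_per_hit / 1000 * usd_per_1k_tokens


# Assumed workload: 1M cache hits, about 1,000 tokens per uncached reply.
per_request = tokens_cost_saved(1_000_000, 1_000, 0.05)
# The same price applied to 1M tokens in total (not 1M requests).
per_token = tokens_cost_saved(1_000_000, 1, 0.05)

print(f"${per_request:,.0f}")  # $50,000
print(f"${per_token:,.0f}")    # $50
```

The thousandfold gap between the two results shows why a savings figure should always state whether it counts requests or tokens.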

Top-to-Bottom Code Walkthrough (app/services/cache_service.py)

This file wraps Redis so every cache call goes through a single safe-fallback entrypoint. The most important guarantee is “Redis down ≠ request down”.

__init__

  • self.client = redis.asyncio.Redis.from_url(...) — async client; its constructor does NOT actually connect. Connection happens on first command.
  • self.default_ttl = 300 — five minutes for anonymous reads.
  • The class never raises when Redis is disabled; is_available() just returns False.

generate_cache_key(namespace, params)

  • Builds f"{namespace}:{hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()[:16]}". The JSON-sorted-keys trick is the secret: {"a":1,"b":2} and {"b":2,"a":1} produce the same hash.
  • Truncating to 16 hex chars keeps keys short (Redis has a per-key memory cost).
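The sorted-keys behaviour described above can be reproduced in a few lines. This is a sketch reconstructed from the expression quoted in the walkthrough; the real method lives on the class and may differ in detail.

```python
import hashlib
import json
from typing import Any


def generate_cache_key(namespace: str, params: dict[str, Any]) -> str:
    """Namespace plus a 16-hex-char SHA-256 digest of the canonical JSON."""
    canonical = json.dumps(params, sort_keys=True)
    digest = hashlib.sha256(canonical.encode()).hexdigest()[:16]
    return f"{namespace}:{digest}"


k1 = generate_cache_key("recs", {"a": 1, "b": 2})
k2 = generate_cache_key("recs", {"b": 2, "a": 1})
k3 = generate_cache_key("recs", {"a": 1, "b": 3})

print(k1 == k2)  # True: key order does not affect the digest
print(k1 == k3)  # False: a different value gives a different key
```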

get(key) -> Optional[Any]

  • await self.client.get(key) — returns the value previously stored (a str when the client is created with decode_responses=True, otherwise bytes), or None on a miss.
  • json.loads(value) reconstructs the original Python object.
  • Wrapped in try/except so any exception (ConnectionError, TimeoutError, JSONDecodeError) returns None rather than bubbling.
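The "any failure becomes a miss" contract can be exercised without a Redis server by swapping in a stub. `FlakyClient` is a hypothetical stand-in, not part of redis-py, and `safe_get` is a sketch of the behaviour described above rather than the file's actual code.

```python
import asyncio
import json
from typing import Any, Optional


class FlakyClient:
    """Hypothetical stand-in for a Redis client; not part of redis-py."""

    def __init__(self, data: dict[str, str], fail: bool = False) -> None:
        self.data, self.fail = data, fail

    async def get(self, key: str) -> Optional[str]:
        if self.fail:
            raise ConnectionError("redis unreachable")
        return self.data.get(key)


async def safe_get(client: FlakyClient, key: str) -> Optional[Any]:
    """Return the decoded value, or None on a miss or any failure."""
    try:
        raw = await client.get(key)
        return None if raw is None else json.loads(raw)
    except Exception:  # connection, timeout, or malformed JSON: treat as a miss
        return None


async def demo() -> list:
    ok = FlakyClient({"k": '{"answer": 42}', "bad": "{not json"})
    down = FlakyClient({}, fail=True)
    return [
        await safe_get(ok, "k"),
        await safe_get(ok, "missing"),
        await safe_get(ok, "bad"),
        await safe_get(down, "k"),
    ]


print(asyncio.run(demo()))  # [{'answer': 42}, None, None, None]
```

Catching every exception keeps the request path safe, at the price of hiding outages, so pairing it with a log line or counter is usually worthwhile.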

set(key, value, ttl)

  • await self.client.setex(key, ttl, json.dumps(value)) — setex is a single round-trip for “set with expiry”, equivalent to set(key, value, ex=ttl). The TTL is mandatory to prevent unbounded Redis growth.

delete(key) and flush_pattern(pattern)

  • delete removes a single known key (e.g. after invalidation on write).
  • flush_pattern uses scan_iter(match=pattern) (NOT keys(), which blocks Redis) to find and delete every key matching a glob.
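A batched flush_pattern along these lines might look like the sketch below. It assumes a client whose scan_iter(match=..., count=...) is an async iterator and whose delete(*names) returns the number of keys removed, as redis.asyncio provides; `StubClient` is a hypothetical in-memory stand-in so the example runs without a server.

```python
import asyncio
from fnmatch import fnmatchcase
from typing import AsyncIterator


class StubClient:
    """Hypothetical in-memory stand-in exposing scan_iter() and delete()."""

    def __init__(self, keys: set[str]) -> None:
        self.keys = set(keys)

    async def scan_iter(self, match: str, count: int = 100) -> AsyncIterator[str]:
        for key in sorted(self.keys):  # sorted() takes a snapshot, so deletes are safe
            if fnmatchcase(key, match):
                yield key

    async def delete(self, *names: str) -> int:
        removed = self.keys.intersection(names)
        self.keys -= removed
        return len(removed)


async def flush_pattern(client, pattern: str, batch_size: int = 500) -> int:
    """Delete every key matching `pattern`, in batches; return the count."""
    deleted, batch = 0, []
    async for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += await client.delete(*batch)
            batch = []
    if batch:
        deleted += await client.delete(*batch)
    return deleted


async def demo() -> tuple[int, set[str]]:
    client = StubClient({"fca:faq:a1", "fca:faq:b2", "fca:recs:c3"})
    return await flush_pattern(client, "fca:faq:*"), client.keys


print(asyncio.run(demo()))  # (2, {'fca:recs:c3'})
```

Batching keeps each DEL small, so the server is never asked to remove an unbounded number of keys in one command.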

Common Pitfalls

Using redis.keys(pattern) in production blocks the Redis event loop for the duration of the scan. Use redis.scan_iter(match=pattern) instead; it walks the keyspace incrementally with the cursor-based SCAN command.

Storing pickle instead of JSON opens a deserialisation attack vector. Anyone who can write to Redis can craft a malicious pickle that executes code when it is loaded. JSON keeps the attack surface small.

Calling client.get() on a closed connection raises a cryptic ConnectionClosedError instead of returning None. Always wrap the call in try/except.

Real-World Interview Prep

Q1: Why is “Redis disabled = no-op” better than “Redis enabled = crash on outage”?

A: Cache should be invisible to the caller. If your app crashes on a Redis blip, you have added a second single point of failure (Redis) next to the DB. Always degrade gracefully.

Q2: What’s wrong with dict(some_obj) for cache keys?

A: dict(obj) only builds a shallow dict; it is not a string, and nested objects may be unhashable or have non-deterministic representations. json.dumps(obj, sort_keys=True, default=str) yields a canonical string that is portable across processes and versions.

Q3: Why does setex beat set(...); expire(...)?

A: setex is one round-trip; set + expire is two, and two RTTs on a hot path double the cache-write latency. setex is also atomic: with separate calls, a worker crash between set and expire would leave a key with no TTL (a slow leak).
