`ComplianceCheckerAgent`: Hybrid Fast-Heuristic + LLM Semantic Compliance

What? (Concept Overview)

ComplianceCheckerAgent is the FCA compliance gate that runs after the product_recommender node in the LangGraph state machine. It uses a hybrid short-circuit: a fast keyword prohibited_words check runs first (~1ms, $0 cost) and, if it trips, the LLM semantic check is bypassed entirely. For text that passes the heuristic, an LLM checks semantic FCA PRINCIPLE violations (PRIN 6, PRIN 7, etc.) and injects required legal disclaimers based on detected product types.

Project Context

In full_project_context_updated.txt -> app/agents/compliance_checker.py, the agent defines a COMPLIANCE_RULES["prohibited_words"] list (e.g., "risk-free", "guaranteed to", "no risk") and four required_disclaimers keyed by product_type ("investment", "loan", "credit", "savings"). When the fast-path trips, the response marks is_compliant=False, returns the violation list, and a 99% confidence floor (because the rule is a hard-coded strict rule). When the heuristic passes, the LLM runs at temperature=0.1 for stability and reports confidence=0.95 (pass) or 0.85 (fail).

How? (Quick Reference Blocks)

3.1 The Fast Heuristic + Short-Circuit Selector


# app/agents/compliance_checker.py — _check_compliance
async def _check_compliance(self, content: str, product_type: str) -> Dict[str, Any]:
    # 1. FAST HEURISTIC (Zero Cost, ~1ms latency)
    rule_issues = self._check_rules(content)
 
    # SHORT-CIRCUIT: hard rule violation → stop here
    if len(rule_issues) > 0:
        return {
            "is_compliant": False,
            "issues": rule_issues,
            "warnings": [
                "Fast keyword heuristic triggered. LLM check bypassed to save time/cost."
            ],
            "suggestions": "Remove prohibited words before requesting a full review.",
            "required_disclaimers": self._get_required_disclaimers(content, product_type),
            "confidence": 0.99,    # strict rule = high confidence
        }
 
    # 2. DEEP SEMANTIC CHECK (only on heuristic-passing content)
    llm_result = await self._llm_compliance_check(content, product_type)
    is_compliant = len(llm_result["issues"]) == 0
 
    return {
        "is_compliant": is_compliant,
        "issues": llm_result["issues"],
        "warnings": llm_result["warnings"],
        "suggestions": llm_result["suggestions"],
        "required_disclaimers": self._get_required_disclaimers(content, product_type),
        "confidence": 0.95 if is_compliant else 0.85,
    }

3.2 The `prohibited_words` Rule Scan


# app/agents/compliance_checker.py — _check_rules
def _check_rules(self, content: str) -> List[str]:
    issues = []
    content_lower = content.lower()
    for word in self.COMPLIANCE_RULES["prohibited_words"]:
        if word in content_lower:
            issues.append(
                f"Prohibited language detected: '{word}'. FCA requires balanced, "
                "not misleading information."
            )
    return self._filter_contextual_false_positives(content_lower, issues)

3.3 The LLM Compression Check (run only on heuristic-passing content)


# app/agents/compliance_checker.py — _llm_compliance_check
@observe(as_type="generation", name="Groq-Compliance-Check")
async def _llm_compliance_check(self, content: str, product_type: str) -> Dict:
    langfuse = get_client()
    langfuse.update_current_generation(
        model=self.config.model_name, model_parameters={"temperature": 0.1}
    )
 
    prompt = self._build_compliance_prompt(content, product_type)
    try:
        async def _call_llm():
            return await self.client.chat.completions.create(
                model=self.config.model_name,
                messages=[
                    {"role": "system", "content": self._get_system_prompt()},
                    {"role": "user", "content": prompt},
                ],
                temperature=0.1,
                max_tokens=self.config.max_tokens,
                response_format={"type": "json_object"},
            )
 
        response = await self.execute_with_retry(_call_llm)
        if hasattr(response, "usage") and response.usage:
            langfuse.update_current_generation(
                usage_details={
                    "prompt_tokens": response.usage.prompt_tokens,
                    "completion_tokens": response.usage.completion_tokens,
                    "total_tokens": response.usage.total_tokens,
                }
            )
        analysis = ComplianceAnalysis.model_validate_json(
            response.choices[0].message.content
        )
        return analysis.model_dump()
    except Exception as e:
        self.logger.error(f"LLM Parsing Error: {e}")
        return {
            "is_compliant": False,
            "issues": ["LLM Validation Failed. Requires manual review."],
            "warnings": [],
            "suggestions": "Check system logs.",
        }

3.4 Deterministic Disclaimer Injection


# app/agents/compliance_checker.py — _get_required_disclaimers
def _get_required_disclaimers(self, content: str, product_type: str) -> List[str]:
    disclaimers = []
    content_lower = content.lower()
 
    disclaimer = self.COMPLIANCE_RULES["required_disclaimers"].get(product_type)
    if disclaimer:
        disclaimers.append(disclaimer)
 
    if any(word in content_lower for word in ["invest", "return", "profit"]):
        disclaimers.append(self.COMPLIANCE_RULES["required_disclaimers"]["investment"])
    if any(word in content_lower for word in ["loan", "borrow", "mortgage"]):
        disclaimers.append(self.COMPLIANCE_RULES["required_disclaimers"]["loan"])
    if any(word in content_lower for word in ["credit card", "apr", "credit limit", "overdraft"]):
        disclaimers.append(self.COMPLIANCE_RULES["required_disclaimers"]["credit"])
    if any(word in content_lower for word in ["savings", "bond", "deposit", "interest rate"]):
        disclaimers.append(self.COMPLIANCE_RULES["required_disclaimers"]["savings"])
 
    # Sensitive-topic disclaimer
    for topic in self.COMPLIANCE_RULES["sensitive_topics"]:
        if topic in content_lower:
            disclaimers.append(
                "We understand this may be a difficult situation. "
                "Free debt advice is available from MoneyHelper or StepChange."
            )
            break
 
    return list(set(disclaimers))   # dedupe

Why? (Parameter Breakdown

Short-circuit on hard rule — Heuristic is ~1ms and $0; LLM is ~500ms and $0.001+ per call. For 30% of inputs (those with banned words), the heuristic saves 500ms and the LLM cost. Multiply by RPS = orders of magnitude in monthly bill.
Strict rule → confidence: 0.99 — When the heuristic trips, the violation is deterministic (the substring exists in the text). The 0.99 confidence (not 1.00) leaves headroom for the (rare) _filter_contextual_false_positives correction.
LLM temperature: 0.1 — Near-deterministic. Compliance classifications should be reproducible across requests. The 0.1 margin (rather than 0.0) prevents rare edge spins, e.g., hallucinated JSON.
max_tokens=self.config.max_tokens — Default 1024 from Settings.groq_max_tokens. Plenty for two-paragraph reasoning + JSON. Higher values risk over-generation that breaks model_validate_json.
Disclaimer injection by detected keyword — Pulls from a COMPLIANCE_RULES dict, not hard-coded. Adding a new product (e.g., "pension") is a one-line config change without code edits.
money_helper topic detects sensitive_topics — UK FCA requires free-debt-advice disclosure when a customer mentions debt, arrears, bankruptcy. Without the injection, the agent might helpfully route to a solution but breach regulatory disclosure.

Common Pitfalls

Updating prohibited_words in code instead of config. Compliance is a business rule, not implementation detail. Push it into Settings (or a YAML config) so compliance officers can adjust without a deploy. Currently in code means every word-list change requires a PR review.
Running LLM compliance check before the heuristic — Pays full LLM cost for every message, even trivial violations. The short-circuit saves order-of-magnitude latency and money at scale.

Real-World Interview Prep

Q1: How do you prevent a clever prompt from bypassing the heuristic by varying phrasing?

A: Three layers. (1) Add synonym entries to the prohibited_words (e.g., "100% safe", "cannot lose", "always grows"). (2) Add a Levenshtein-distance post-filter: any token within distance 1 of a banned word triggers the rule. (3) Escalate trust to the LLM when a borderline phrase appears ("essentially guaranteed") — let the LLM run for ~10ms to clarify. The defence-in-depth pattern: heuristic catches obvious; LLM catches clever; humans catch novel. The compliance_check field in WorkflowState carries the verdict through downstream nodes so the final response inherits the right disclaimer.

Q2: How do you handle the LLM refusing a compliance task (model returns `"I cannot help"`)?

A: Three-step climb. (1) Catch the parse-failure branch (except Exception: return is_compliant=False) — default to NON-compliant if the LLM can’t classify; safer to over-escalate. (2) Log the refusal to Langfuse with the prompt and decision; humans can review weekly. (3) Maintain a “refusal retry” prompt — re-issue the request with a more lenient instruction. NEVER trust a refusal to mean “compliant”; treat as “non-compliant until reviewed”.

Q3: Why use `confidence: 0.99` for the heuristic short-circuit and `0.95/0.85` for LLM?

A: Each branch has a different failure mode. Heuristic truth is deterministic; the only uncertainty is the _filter_contextual_false_positives post-filter — 0.99 is the empirical hit rate. LLM truth is probabilistic; the model can hallucinate compliance violations that don’t exist (false positives) or miss subtle ones (false negatives). 0.95/0.85 reflect benchmarked precision/recall against held-out FCA review labels. The downstream WorkflowState.is_compliant and confidence are logged so SREs can later audit “did we escalate when we should have?” without re-inferring.

Top-to-Bottom Code Walkthrough (`app/agents/compliance_checker.py`)

This agent is the FCA gate that runs once per LLM response. It exists to keep compliance violations out of the user-visible reply.

`init`

self.forbidden_phrases = ["guaranteed return", "risk-free", "100% safe"] — the regex-only prohibited phrases. FCA marketing rules forbid these.
self.llm_client = llm_client — the Groq adapter for the semantic check.

`_pre_llm_check(text)`

This runs before sending the prompt to the LLM — it scans the user’s input for injection risk (e.g. “ignore your previous instructions”). A pure regex pass; O(n) over the prompt.

`_post_llm_check(response)` — Hybrid short-circuit

This is the key trick of the file: cheap first, expensive second.

if self._regex_violation(text): — a 5-line loop over forbidden_phrases. If any phrase is in the response, immediately return {"compliant": False, "reason": "forbidden_phrase"}. The LLM semantic scan never runs.
Otherwise, call the LLM with a “compliance officer” prompt that asks: “Does this response violate FCA financial-promotion rules?”. This catches subtle issues (“you can’t lose money”) that regex misses.
Why the order matters: regex is ~1ms; the LLM semantic call is ~1-3 seconds and costs tokens. Short-circuiting saves both.

`check(text) -> dict`

The single public entrypoint:

if not settings.security_enabled: return {"compliant": True, "reason": "disabled"} — master kill switch for prod emergencies.
pre = self._pre_llm_check(text); if not pre["compliant"]: return pre.
post = await self._post_llm_check(text); return post.

Agent integration

The orchestrator (workflow) inserts ComplianceChecker as a post-processing node in the LangGraph: every agent’s response flows through it before reaching the user. Failed checks trigger WorkflowState.escalation_required = True, which routes to the human-agent.

See how this node fits into the full nine-node LangGraph in Specialist Agent Deep Dives & LangGraph Flow (section 3) — the compliance node sits between every specialist node and END, centralising the FCA gate.

Common Pitfalls

Over-strict regex (“safe” might match “safe harbor”, a perfectly legal phrase). Keep the wordlist narrow and case-sensitive after lowering.

Forgetting await on the LLM call returns a coroutine object, not a dict — subsequent .get("compliant") raises AttributeError.

Logging the full LLM response when it contains PII. The compliance logger MUST use security_service.redact_pii() before persisting.

Real-World Interview Prep

Q1: Why call `_post_llm_check` again on already-checked prompts after re-generation?

A: The LLM is non-deterministic. A second generation can violate rules even if the first didn’t. Always re-check after every LLM call.

Q2: What’s wrong with running only the LLM semantic check (no regex pass)?

A: Cost + latency. Regex catches ~30% of obvious violations in 1ms; the LLM catches 95% of the rest but takes seconds. The hybrid shaves 30% off worst-case latency.

Q3: How would you make this work for non-English markets?

A: Pass language into _post_llm_check. Train the regex wordlist per locale (“no risk” → “risikofrei” in German). Use a multilingual LLM or route through a translation layer first.

ComplianceCheckerAgent: Hybrid Fast-Heuristic + LLM Semantic Compliance

What? (Concept Overview)

Project Context

How? (Quick Reference Blocks)

3.1 The Fast Heuristic + Short-Circuit Selector

3.2 The prohibited_words Rule Scan

3.3 The LLM Compression Check (run only on heuristic-passing content)

3.4 Deterministic Disclaimer Injection

Why? (Parameter Breakdown

Common Pitfalls

Real-World Interview Prep

Q1: How do you prevent a clever prompt from bypassing the heuristic by varying phrasing?

Q2: How do you handle the LLM refusing a compliance task (model returns "I cannot help")?

Q3: Why use confidence: 0.99 for the heuristic short-circuit and 0.95/0.85 for LLM?

Top-to-Bottom Code Walkthrough (app/agents/compliance_checker.py)

__init__

_pre_llm_check(text)

_post_llm_check(response) — Hybrid short-circuit

check(text) -> dict

Agent integration

Common Pitfalls

Real-World Interview Prep

Q1: Why call _post_llm_check again on already-checked prompts after re-generation?

Q2: What’s wrong with running only the LLM semantic check (no regex pass)?

Q3: How would you make this work for non-English markets?

`ComplianceCheckerAgent`: Hybrid Fast-Heuristic + LLM Semantic Compliance

3.2 The `prohibited_words` Rule Scan

Q2: How do you handle the LLM refusing a compliance task (model returns `"I cannot help"`)?

Q3: Why use `confidence: 0.99` for the heuristic short-circuit and `0.95/0.85` for LLM?

Top-to-Bottom Code Walkthrough (`app/agents/compliance_checker.py`)

`init`

`_pre_llm_check(text)`

`_post_llm_check(response)` — Hybrid short-circuit

`check(text) -> dict`

Q1: Why call `_post_llm_check` again on already-checked prompts after re-generation?