CCA-F Preparation Guide
EXAM PREP

Agent SDK & The Agent Loop

The agent loop is the heartbeat of every Claude-powered agent. It orchestrates the conversation between Claude, your tools, and the user — iterating until Claude signals it is done via stop_reason = "end_turn".

Domain 1 · 27% · Core Pattern
How the Agent Loop Works
The Agent Loop — 5-Step Cycle
[Diagram: User message (initial input) → Claude API messages.create() → check stop_reason. If end_turn: return response. If tool_use: execute tool → append result to messages[] → loop back to the API call.]
⚠️
Exam Trap: stop_reason == "end_turn" is the only reliable exit signal. Never exit based on text content — Claude can return text AND a tool_use block in the same response. If you check text and exit early, you silently drop tool calls.
Step-by-Step: Building the Loop
1

Initialize messages list with user input

Start with messages = [{"role":"user","content": user_input}]. The system prompt goes in a separate system= parameter, NOT inside messages.

2

Call Claude API — check stop_reason immediately

Call client.messages.create(model, max_tokens, system, messages, tools). Read response.stop_reason FIRST before touching content. Branch on "end_turn" vs "tool_use".

3

If tool_use: run hooks → execute tool → collect result

Run pre_tool_use_hook() first — it may redirect or intercept. Then execute the tool. Run post_tool_use_hook() to normalize/trim the result. Collect all tool results in a list.

4

Append assistant message + tool results to messages

Append the assistant's FULL content (including all tool_use blocks) as {"role":"assistant","content": content}, then append tool results as {"role":"user","content": tool_results}.

5

If end_turn: extract final text and return

Find the text block in content and return it. Include a max_iterations guard to prevent infinite loops. The full messages list is re-sent on every iteration, so input tokens per call grow roughly linearly with the iteration count, and the cumulative cost grows faster than that.

Hook Pipeline: Deterministic vs Probabilistic

❌ System Prompt Rules (Probabilistic)

Writing "Never approve refunds over $500" in the system prompt is probabilistic. In long conversations or with adversarial inputs, Claude may not follow it. Token dilution weakens prompt-based rules over time.

✅ Hooks (Deterministic)

A pre_tool_use_hook() that checks if amount > 500: intercept() ALWAYS fires. It's your code — not Claude's interpretation. Use hooks for business rules with real-world consequences.

Hook Pipeline Flow
[Diagram: Tool called → pre_tool_use_hook (intercept? redirect? block?) → Execute tool → post_tool_use_hook (normalize, trim, enrich)]
Core Code Pattern
agent.py: run_agent_loop() (Python)
def run_agent_loop(client, user_msg, tools, system, max_iter=10):
    # extract_text, pre_tool_use_hook, handle_intercept, execute_tool and
    # post_tool_use_hook are assumed to be defined elsewhere in agent.py.
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_iter):
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            system=system,  # system prompt is SEPARATE from messages
            messages=messages,
            tools=tools,
        )
        # ① stop_reason is THE exit signal: check it FIRST
        if response.stop_reason == "end_turn":
            return extract_text(response.content)
        if response.stop_reason != "tool_use":
            # e.g. "max_tokens": re-sending the same messages would not help
            raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")
        # Keep the assistant's FULL content, including every tool_use block
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            # ② Pre-hook: deterministic business rule enforcement
            intercept = pre_tool_use_hook(block.name, block.input)
            if intercept:
                result = handle_intercept(intercept)
            else:
                result = execute_tool(block.name, block.input)
            # ③ Post-hook: normalize + trim result
            result = post_tool_use_hook(block.name, result)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result),
            })
        messages.append({"role": "user", "content": tool_results})
    # The max_iter guard tripped before Claude returned end_turn
    raise RuntimeError(f"Agent loop hit max_iter={max_iter} without end_turn")
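The loop above calls several helpers it does not define. A minimal sketch of three of them follows, assuming content blocks expose .type and .text as in the loop, and reusing the $500 refund rule and the process_refund tool name from this guide. The intercept dict shape is hypothetical.

```python
from types import SimpleNamespace

REFUND_LIMIT = 500  # limit taken from the refund example in this guide


def extract_text(content):
    """Join the text of every text block; tool_use blocks are skipped."""
    return "".join(b.text for b in content if b.type == "text")


def pre_tool_use_hook(tool_name, tool_input):
    """Return an intercept dict to block the call, or None to let it run."""
    if tool_name == "process_refund" and tool_input.get("amount", 0) > REFUND_LIMIT:
        return {"reason": f"Refund over ${REFUND_LIMIT} requires human approval"}
    return None


def handle_intercept(intercept):
    """Build the result Claude sees in place of real tool output."""
    return {"status": "blocked", "reason": intercept["reason"]}


# Usage: stand-in content blocks shaped like the SDK's response.content
content = [
    SimpleNamespace(type="text", text="Refund issued."),
    SimpleNamespace(type="tool_use", name="process_refund",
                    input={"amount": 750}, id="t1"),
]
final_text = extract_text(content)
blocked = pre_tool_use_hook("process_refund", {"amount": 750})
allowed = pre_tool_use_hook("process_refund", {"amount": 120})
```

Because the check is ordinary code, it runs on every tool call no matter how long the conversation has become.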
Subagents & Explicit Context

🔑 Rule: Subagents have NO shared memory

A subagent starts with a blank context. The coordinator must inject ALL context explicitly into the Task prompt — topic, prior findings, output format, constraints. Never assume a subagent knows what the coordinator knows.

⚡ Parallel spawning = multiple Task calls in one response

If the coordinator returns three {"type":"tool_use","name":"Task"} blocks in one response, all three run concurrently. This is how you get true parallel execution — one response, multiple simultaneous subagents.

💡
Hub-and-spoke pattern: Coordinator knows everything; subagents know only what they're told. Results always flow back to the coordinator, which accumulates the full knowledge state. This prevents context fragmentation.
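One way to picture explicit context plus parallel spawning is the sketch below. It assumes each Task block carries its prompt under input["prompt"] (a hypothetical field name) and uses a stand-in function in place of a real subagent call.

```python
from concurrent.futures import ThreadPoolExecutor


def build_task_prompt(topic, prior_findings, output_format, constraints):
    """Pack everything the subagent needs into one self-contained prompt."""
    findings = "\n".join(f"- {f}" for f in prior_findings) or "- (none yet)"
    return (
        f"Topic: {topic}\n"
        f"Prior findings:\n{findings}\n"
        f"Output format: {output_format}\n"
        f"Constraints: {constraints}"
    )


def run_parallel(task_blocks, run_subagent):
    """Run every Task block from one response concurrently, keyed by block id."""
    with ThreadPoolExecutor(max_workers=len(task_blocks) or 1) as pool:
        futures = {
            b["id"]: pool.submit(run_subagent, b["input"]["prompt"])
            for b in task_blocks
        }
        return {block_id: f.result() for block_id, f in futures.items()}


def fake_subagent(prompt):
    # Stand-in for a real subagent: echoes the first line it was given.
    return prompt.splitlines()[0]


blocks = [
    {"type": "tool_use", "name": "Task", "id": f"task_{i}",
     "input": {"prompt": build_task_prompt(
         topic, ["baseline done"], "3 bullets", "cite sources")}}
    for i, topic in enumerate(["pricing", "latency", "security"])
]
results = run_parallel(blocks, fake_subagent)
```

Results come back keyed by block id, so the coordinator can attach each one to the matching tool_use_id and keep the full knowledge state in one place.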
Quick Check

Q1. When is it safe to exit the agent loop?

A. When Claude's text response contains "Task complete"
B. After every tool call has returned a result
C. Only when stop_reason == "end_turn"
D. When the messages list exceeds 10 entries
Answer: C. stop_reason is the only reliable exit signal. Claude can emit text AND a tool_use block in the same response, so text content is never a safe exit condition.

Q2. What is the key advantage of pre_tool_use_hook over system prompt instructions?

A. Hooks are deterministic code — they always fire regardless of context length
B. Hooks can modify the model's system prompt at runtime
C. Hooks eliminate the need for tools altogether
D. Hooks reduce the token count of each API call
Answer: A. System prompt instructions are probabilistic and can be diluted. Hooks are your code, not Claude's interpretation, so they always execute.
⚙️

Claude Code Configuration

Claude Code uses a layered configuration system to apply the right instructions to the right files at the right time — saving context window tokens and ensuring team-wide consistency.

Domain 3 · 20% · CLAUDE.md Hierarchy
The 3-Level CLAUDE.md Hierarchy
Configuration Inheritance — Outer to Inner
LEVEL 1: USER (~/.claude/CLAUDE.md)
Personal preferences, private tokens. ONE developer only. NEVER commit to VCS.
LEVEL 2: PROJECT (./CLAUDE.md or ./.claude/CLAUDE.md)
Team-wide standards. ALL developers on clone. ALWAYS commit to VCS.
LEVEL 3: DIRECTORY (./src/payments/CLAUDE.md)
Domain-specific rules. Loads only when working in that directory. Commit to VCS.
⚠ Common mistake: putting team standards here instead of Level 2. New devs won't get them.
Level | Location | Who sees it | Commit to VCS?
1 (User) | ~/.claude/CLAUDE.md | Only you | ❌ Never
2 (Project) | ./CLAUDE.md | Entire team | ✅ Always
3 (Directory) | ./src/api/CLAUDE.md | That directory only | ✅ Yes
Path-Specific Rules — Context Window Savings

How .claude/rules/ works

Rules in .claude/rules/ use YAML frontmatter to declare which files they apply to. Claude Code only loads a rule if the currently-edited file matches the glob pattern.

.claude/rules/api-conventions.md (YAML Frontmatter)
---
paths: ["src/api/**/*"]  # Loads ONLY when editing src/api/ files
---
# API Conventions
- All handlers must be async
- Return wrapper: `{"success": bool, "data": ..., "error": ...}`
- Use Pydantic for input validation
- Raise specific exceptions (not bare Exception)
💡
Context savings: With 10 rule files averaging 800 tokens each, loading every rule on every request costs 8,000 tokens. Path-specific rules can cut this to the 1 or 2 rules that match, saving roughly 6,400 to 7,200 tokens per edit.
❌ Wrong approach

Put all rules in a single CLAUDE.md — every rule loads for every file type, burning context on irrelevant rules.

✅ Right approach

Split rules by file type in .claude/rules/ with path frontmatter — rules load only when relevant.
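The context-savings arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch that assumes every rule file is the same size, as the 800-token average in the note above does.

```python
def tokens_saved(rule_files, avg_tokens, rules_loaded):
    """Tokens saved per request when only the matching rules are loaded."""
    load_everything = rule_files * avg_tokens
    load_matching = rules_loaded * avg_tokens
    return load_everything - load_matching


saved_two = tokens_saved(10, 800, 2)  # two rules match the edited file
saved_one = tokens_saved(10, 800, 1)  # only one rule matches
print(saved_two, saved_one)  # 6400 7200
```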

Skills — On-Demand Tasks with Isolation
Feature | CLAUDE.md | Skill
When active | Always loaded | Only when invoked (/skill-name)
Use for | General standards, always-apply rules | Specific tasks: code review, test gen
Output isolation | No | Yes: context: fork keeps output separate
Tool restriction | N/A | allowed-tools enforces least privilege
.claude/skills/code-review/SKILL.md (Skill definition)
context: fork  # isolated session, won't pollute main context
allowed-tools: ["Read", "Grep", "Glob"]  # read-only, no Write

## Code Review Checklist
1. Check for N+1 query patterns in loops
2. Verify no PII is logged
3. Ensure all endpoints have auth dependency
4. Validate response wrapper pattern: {success, data, error}
MCP Configuration — Team vs Personal

✅ .mcp.json (commit to VCS)

Team-shared MCP servers. Use ${ENV_VAR} syntax — real tokens go in .env (gitignored) or CI secrets. Cloning the repo gives the whole team the server config.

🔒 ~/.claude.json (NEVER commit)

Personal MCP servers, local dev databases, experiments. Each developer maintains their own. Copy personal-claude-override.example.json as a template.

⚠️
Exam distinction: If a question asks "where do you put a Slack MCP that the entire team should use?" → .mcp.json with ${SLACK_TOKEN}. If it asks "where do you put your personal dev DB?" → ~/.claude.json.
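To see why committing .mcp.json is safe, it helps to picture the ${ENV_VAR} substitution. The sketch below only illustrates the idea and is not Claude Code's actual loader. The server name and command are hypothetical, and the mcpServers layout is assumed.

```python
import re


def expand_env(value, env):
    """Recursively replace ${NAME} placeholders using the env mapping."""
    if isinstance(value, dict):
        return {k: expand_env(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v, env) for v in value]
    if isinstance(value, str):
        # A missing variable raises KeyError, so misconfiguration fails loudly
        return re.sub(r"\$\{(\w+)\}", lambda m: env[m.group(1)], value)
    return value


# What gets committed: placeholders only, never the real token
committed = {
    "mcpServers": {
        "slack": {
            "command": "slack-mcp-server",  # hypothetical command
            "env": {"SLACK_TOKEN": "${SLACK_TOKEN}"},
        }
    }
}
# What each developer's machine resolves at runtime
resolved = expand_env(committed, {"SLACK_TOKEN": "xoxb-local-secret"})
```

The committed structure never contains the secret. Only the resolved copy does, and that copy exists solely in the developer's environment.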
Interactive: Configuration File Scope Explorer

📁 my-project/
  📄 CLAUDE.md (ALL team members)
  📄 .mcp.json (Team MCP servers)
  📁 .claude/
    📁 rules/
      📄 api-conventions.md (src/api/**/*)
      📄 testing.md (**/*.test.py)
    📁 skills/
      📄 code-review/SKILL.md (On demand: /code-review)
  📄 personal-override.example.json (Copy to ~/.claude.json)
Claude Code Hooks — 5 Handler Types & Key Events

What hooks actually are

Hooks are JSON-configured shell commands, HTTP endpoints, MCP tool calls, LLM prompts, or (experimentally) subagents that fire automatically at lifecycle events. They are not Python functions you call in your agent loop. They are Claude Code's own event system, configured in settings files.

⚠️
Exam trap: A blocking hook (exit code 2) stops a tool call even if an allow rule would permit it. Deny rules still evaluate regardless of hook output. Precedence: blocking hook → deny rule → ask rule → allow rule. Exit code 0 with no output = hook has no decision; normal permission flow applies.
Hook type key | What runs | Details
command | Shell script | Receives JSON on stdin, communicates via exit codes and stdout
http | HTTP endpoint | JSON posted to URL; response body = decision
mcp_tool | MCP tool call | Calls a tool on an already-connected MCP server
prompt | LLM prompt | Sends to Claude model for yes/no evaluation
agent | Subagent | Spawns subagent with Read/Grep/Glob tools (experimental)
Event | When it fires | Can block?
SessionStart | Session begins or resumes | No
UserPromptSubmit | User submits a prompt, before Claude processes it | No
PreToolUse | Before a tool call executes | Yes
PermissionRequest | When a permission dialog appears | Yes
PermissionDenied | Tool call denied by auto mode; return {retry:true} to retry | No
PostToolUse | After a tool call succeeds | No
PostToolBatch | After ALL parallel tool calls resolve; before next model call | No
Stop | When Claude finishes responding (end_turn) | No
SubagentStart / SubagentStop | When a subagent is spawned / finishes | No
InstructionsLoaded | When a CLAUDE.md or rules file loads | No
FileChanged | When a watched file changes on disk | No
PreCompact / PostCompact | Before/after context compaction | No
SessionEnd | When session terminates | No
💡
Hook config scope: Define hooks in ~/.claude/settings.json (all your projects) or .claude/settings.json (single project, committable to VCS). Hooks from the project settings file can be shared with the team. Personal hooks or security policies go in ~/.claude/settings.json.
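A command hook is just a script that reads the event JSON on stdin and answers with an exit code. The sketch below shows what a PreToolUse handler could look like. It assumes the payload carries tool_name and tool_input fields, and the rm policy is a made-up example.

```python
import io
import json
import sys

BLOCK = 2        # exit code 2: blocking decision
NO_DECISION = 0  # exit code 0 with no output: normal permission flow applies


def decide(event):
    """Return (exit_code, message) for one PreToolUse event payload."""
    tool = event.get("tool_name")
    command = event.get("tool_input", {}).get("command", "")
    if tool == "Bash" and command.strip().startswith("rm "):
        return BLOCK, "rm is blocked by project policy"
    return NO_DECISION, ""


def main(stdin=None):
    """Read one event as JSON and return the exit code the script should use."""
    event = json.load(stdin if stdin is not None else sys.stdin)
    code, message = decide(event)
    if message:
        print(message, file=sys.stderr)  # explanation for the blocked call
    return code


# Usage: feed a stand-in payload instead of real stdin.
# As an installed hook script, the last line would be sys.exit(main()).
payload = io.StringIO(json.dumps(
    {"tool_name": "Bash", "tool_input": {"command": "rm -rf build"}}))
exit_code = main(payload)
```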
Permission System — Modes, Rules & Precedence
⚠️
Rule evaluation order: deny → ask → allow. The FIRST matching rule wins. A deny rule in user settings blocks even if project settings has an allow rule for the same tool. Deny rules take absolute precedence.
Mode (defaultMode) | Behavior
default | Prompts on first use of each tool
acceptEdits | Auto-accepts file edits and common filesystem commands
plan | Read-only: Claude can explore but cannot edit files
auto | Background safety checks; auto-approves aligned actions (research preview)
dontAsk | Auto-denies unless pre-approved via allow rules
bypassPermissions | Skips ALL prompts; only for isolated containers/VMs

Permission rule syntax

Bash — matches ALL bash commands; as deny, removes tool from context entirely
Bash(rm *) — scoped; leaves tool available, blocks only matching commands
mcp__memory__.* — all tools from the memory MCP server (regex)
mcp__memory__create_entities — one specific MCP tool
Agent(Explore) — controls which subagents Claude can spawn
WebFetch(domain:example.com) — domain-scoped web access

⚠️
Key distinction: A bare deny rule like Bash removes the tool from Claude's context entirely — Claude never sees it. A scoped deny like Bash(rm *) leaves the tool available and blocks only matching calls. This matters for questions about "removing a tool vs restricting a tool."
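The deny → ask → allow order can be sketched as a small evaluator. The matching here is deliberately simplified (a bare tool name, or Tool(pattern) with shell-style wildcards) and is an assumption for illustration, not Claude Code's real matcher. The "prompt" fallback label is also hypothetical.

```python
from fnmatch import fnmatchcase


def rule_matches(rule, tool, argument):
    """Bare rule 'Bash' matches every call; 'Bash(rm *)' matches by pattern."""
    if "(" not in rule:
        return rule == tool
    name, _, pattern = rule.partition("(")
    return name == tool and fnmatchcase(argument, pattern.rstrip(")"))


def evaluate(tool, argument, deny=(), ask=(), allow=()):
    """Check deny, then ask, then allow; the first tier with a match decides."""
    for decision, rules in (("deny", deny), ("ask", ask), ("allow", allow)):
        if any(rule_matches(r, tool, argument) for r in rules):
            return decision
    return "prompt"  # no rule matched: the permission mode decides


blocked = evaluate("Bash", "rm -rf /tmp/x", deny=["Bash(rm *)"], allow=["Bash"])
permitted = evaluate("Bash", "ls -la", deny=["Bash(rm *)"], allow=["Bash"])
```

Even though the allow list contains a bare Bash rule, the scoped deny is checked first, so the rm call is denied while ls still goes through.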
Quick Check

Q1. A new team member joins. Which CLAUDE.md level ensures they automatically get the team coding standards when they clone the repo?

A. Level 1 — User (~/.claude/CLAUDE.md)
B. Level 2 — Project (./CLAUDE.md committed to VCS)
C. Level 3 — Directory (./src/CLAUDE.md)
D. Any level works equally well
Answer: B. Level 2 (project) is committed to VCS, so any developer who clones the repo gets it automatically. Level 1 is personal (not committed). Level 3 is directory-scoped.
🔧

MCP Tool Design

Tool descriptions are selection mechanisms — Claude reads them to decide which tool to call. Ambiguous descriptions cause wrong choices. Structured errors enable intelligent retry decisions.

Domain 2 · 18% · Tool Descriptions
Tool Description Quality
❌ Weak description
{
  "name": "get_customer",
  "description": "Get customer information",
  "input_schema": {
    "type": "object",
    "properties": {
      "identifier": {"type": "string"}
    }
  }
}

Problem: Claude must guess whether to pass an email or ID. With two similar tools, it may pick the wrong one.

✅ Strong description
{
  "name": "get_customer_by_email",
  "description": "Retrieve customer record using their email address. Use for INITIAL lookup when you only have the email. Do NOT use if you already have the customer_id — use get_customer_by_id instead.",
  "input_schema": {
    "type": "object",
    "properties": {
      "email": {"type": "string", "description": "Customer's email address"}
    },
    "required": ["email"]
  }
}
ℹ️
3 rules for tool descriptions:
1. State the primary use case — when should Claude choose this tool?
2. State exclusion conditions — when should Claude NOT choose this tool?
3. Include disambiguation cues for tools with similar names or inputs.
Structured Error Taxonomy

Return structured errors, not strings. The agent loop reads isRetryable and errorCategory to make branching decisions — no natural language parsing required.

Category | Retryable? | Example | Agent Action
TRANSIENT | Yes | Network timeout, 503, rate limit | Retry with exponential backoff
VALIDATION | No | Wrong field type, missing required field | Fix input, re-submit
BUSINESS | No | Refund exceeds limit, unauthorized action | Escalate to human
PERMISSION | No | Access denied, auth required | Escalate or surface to user
errors.py: Structured error return (Python)
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ErrorCategory(Enum):
    TRANSIENT = "transient"    # retry OK
    VALIDATION = "validation"  # fix input
    BUSINESS = "business"      # escalate
    PERMISSION = "permission"  # escalate


@dataclass
class StructuredToolError:
    errorCategory: ErrorCategory
    isRetryable: bool
    message: str
    attempted_operation: str
    partial_results: Optional[dict] = None


# Usage in tool implementation:
def process_refund(amount, **details):  # details: any other refund fields
    if amount > 500:
        return StructuredToolError(
            errorCategory=ErrorCategory.BUSINESS,
            isRetryable=False,  # agent won't retry
            message="Refund exceeds automated limit",
            attempted_operation="process_refund",
        )
    ...  # normal refund processing goes here
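On the agent side, isRetryable is what drives the branch. Below is a minimal sketch of retry with exponential backoff, using a trimmed-down error type so the block stands alone. The outcome labels and the default delays are assumptions.

```python
import time
from dataclasses import dataclass


@dataclass
class ToolError:
    errorCategory: str  # "transient", "validation", "business", "permission"
    isRetryable: bool
    message: str


def call_with_retry(tool_fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry only retryable errors, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        result = tool_fn()
        if not isinstance(result, ToolError):
            return "ok", result
        if not result.isRetryable:
            return "not_retryable", result  # caller branches on errorCategory
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return "retries_exhausted", result


# Usage: a stand-in tool that fails twice with a transient error
attempts = []


def flaky_tool():
    attempts.append(1)
    if len(attempts) < 3:
        return ToolError("transient", True, "503 from upstream")
    return {"refund_id": "r-1"}


delays = []
outcome, value = call_with_retry(flaky_tool, sleep=delays.append)
```

Passing sleep as a parameter keeps the sketch testable without real waiting. In production the default time.sleep applies.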
Quick Check

Q1. A tool returns a rate limit error (HTTP 429). What errorCategory and isRetryable should it use?

A. TRANSIENT, isRetryable=True
B. BUSINESS, isRetryable=False
C. VALIDATION, isRetryable=True
D. PERMISSION, isRetryable=False
Answer: A. Rate limits are transient: the service is temporarily unavailable, not broken. Use TRANSIENT with isRetryable=True and retry with backoff.
📊

Data Extraction Pipeline

Use tool_choice: {type:"tool"} to force structured output, Pydantic for semantic validation, a retry-with-feedback loop for arithmetic errors, and the Batches API for cost-efficient bulk processing.

Domain 4 · 20% · Structured Output
Full Extraction Pipeline
Document: raw text
Few-shot Prompt: 4 examples + rules
Claude API: tool_choice forced
tool_use block: structured JSON
Pydantic Validate: arithmetic checks
Route by Confidence: 3 tiers
⚠️
Key exam point: tool_choice: {"type":"tool","name":"extract_invoice_data"} forces Claude to call exactly that tool. stop_reason will be "tool_use", not "end_turn". This guarantees valid JSON structure — but NOT correct values. Semantic validation (arithmetic) is your responsibility.
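A forced tool call could be assembled and read back roughly as follows. The helper names are hypothetical, and the stand-in response only mimics the block attributes (type, name, input) that the agent loop in this guide relies on.

```python
from types import SimpleNamespace


def build_extraction_request(document_text, tool, model, max_tokens=1024):
    """Keyword arguments for messages.create() that force one specific tool."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "tools": [tool],
        "tool_choice": {"type": "tool", "name": tool["name"]},
        "messages": [{"role": "user", "content": document_text}],
    }


def extract_tool_input(content, tool_name):
    """Return the input dict of the named tool_use block, or None if absent."""
    for block in content:
        if block.type == "tool_use" and block.name == tool_name:
            return block.input
    return None


tool = {"name": "extract_invoice_data", "input_schema": {"type": "object"}}
request = build_extraction_request(
    "Invoice 42 ...", tool, model="claude-3-5-sonnet-20241022")

# Stand-in for response.content after a forced call (stop_reason == "tool_use")
content = [SimpleNamespace(type="tool_use", name="extract_invoice_data",
                           input={"invoice_number": "42"}, id="t1")]
data = extract_tool_input(content, "extract_invoice_data")
```

With a real client the request would be sent as client.messages.create(**request). The extracted dict still needs semantic validation before use.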
JSON Schema Design Rules

Required vs Nullable Fields

Use "type": ["string","null"] for genuinely optional fields. This allows null without being required — prevents Claude from fabricating values for missing data.

"other" Enum + Detail Field

Never make an enum without an "other" option. Pair it with a detail field: "currency_detail" captures the actual value when the currency isn't in your enum. Prevents data loss.

schemas.py: Extraction tool definition, key fields (Python)
InvoiceExtractionTool = {
    "name": "extract_invoice_data",
    "input_schema": {
        "type": "object",
        "properties": {
            # Required: minimum viable data
            "vendor_name": {"type": "string"},
            "invoice_number": {"type": "string"},
            "total_amount": {"type": "number"},
            # Nullable: genuinely optional; null beats fabrication
            "payment_terms": {"type": ["string", "null"]},
            "po_number": {"type": ["string", "null"]},
            # "other" enum + detail to prevent data loss
            "currency": {"type": "string", "enum": ["USD", "EUR", "GBP", "other"]},
            "currency_detail": {"type": ["string", "null"],
                                "description": "Fill if currency='other'"},
            # Confidence enables downstream routing
            "confidence_score": {"type": "number", "minimum": 0.0, "maximum": 1.0},
        },
        "required": ["vendor_name", "invoice_number", "total_amount", "confidence_score"],
    },
}
Validation Layers
1

Layer 1 — JSON Schema (API level, automatic)

The API validates structure before returning: correct field names, types, required fields. This is free — you get it from tool_use. It catches syntactic errors.

2

Layer 2 — Pydantic (semantic, your responsibility)

Run Pydantic validators on the extracted data. Check arithmetic: sum(line_items) ≈ total_amount. Check date formats. These are semantic errors — valid JSON structure, wrong values.

3

Retry with specific feedback (max 2 attempts)

If validation fails, build a feedback prompt: "sum(line_items)=$145 but total_amount=$200 — difference $55 is missing. Re-read for fees/taxes." Pass this with the original document. Claude finds the missed line item.

🚫
When retry WON'T help: If the invoice has NO invoice number, retrying just produces null again (or a hallucination). Retry is effective only when "the information IS in the document but Claude missed it." Route to human review when data is genuinely absent.
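The arithmetic check and the retry-with-feedback step can be sketched without any library. The same check could equally live inside a Pydantic validator. Here extract is a hypothetical callable standing in for the Claude call, and the retry count follows the "max 2 attempts" rule above.

```python
def check_totals(line_items, total_amount, tolerance=0.01):
    """Return None if the line items add up, else a specific feedback string."""
    items_sum = round(sum(line_items), 2)
    difference = round(total_amount - items_sum, 2)
    if abs(difference) <= tolerance:
        return None
    return (
        f"sum(line_items)=${items_sum:.2f} but total_amount=${total_amount:.2f}; "
        f"difference ${difference:.2f} is unaccounted for. Re-read for fees/taxes."
    )


def extract_with_retry(extract, document, max_retries=2):
    """Call extract(document, feedback) until totals reconcile or retries run out."""
    feedback = None
    for _ in range(max_retries + 1):
        data = extract(document, feedback)
        feedback = check_totals(data["line_items"], data["total_amount"])
        if feedback is None:
            return data, "valid"
    return data, "needs_human_review"


def fake_extract(document, feedback):
    # Stand-in for a Claude call: finds the missed $55 fee once given feedback.
    items = [100.0, 45.0] if feedback is None else [100.0, 45.0, 55.0]
    return {"line_items": items, "total_amount": 200.0}


data, status = extract_with_retry(fake_extract, "invoice text")
```

If the totals never reconcile, the function hands the last attempt to human review instead of looping, which matches the case where the data is simply not in the document.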
Confidence-Based Routing (3 Tiers)
AUTO_PROCESS
confidence ≥ 0.85 AND valid AND no warnings
Still sample 5% randomly for audit — catches systematic confidence miscalibration
HUMAN_REVIEW
confidence 0.60–0.84 OR has warnings OR amount > threshold
Queue for human review — don't block the pipeline
REJECT
confidence < 0.60 OR has errors (blocking validation failures)
Do not process — flag for investigation
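The three tiers translate into a short routing function. The parameter names and the default amount threshold are assumptions for illustration; the confidence cut-offs are the ones listed above.

```python
def route(confidence, has_errors=False, has_warnings=False,
          amount=0.0, amount_threshold=10_000.0):
    """Map one extraction result to AUTO_PROCESS, HUMAN_REVIEW or REJECT."""
    # Blocking failures and very low confidence are rejected first
    if has_errors or confidence < 0.60:
        return "REJECT"
    # Anything uncertain, flagged, or high-value goes to a person
    if confidence < 0.85 or has_warnings or amount > amount_threshold:
        return "HUMAN_REVIEW"
    return "AUTO_PROCESS"


decisions = [
    route(0.92),
    route(0.92, has_warnings=True),
    route(0.70),
    route(0.95, has_errors=True),
    route(0.90, amount=50_000.0),
]
```

Checking REJECT before HUMAN_REVIEW matters: an item with blocking errors should never reach the review queue just because its confidence is high.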
Message Batches API

✅ Use batch API when

Nightly invoice processing, bulk historical digitization, training data generation, scheduled reporting. Volume is high, timing is flexible.

❌ Do NOT use batch API when

User uploads invoice and waits for result, real-time webhook-triggered processing, anything with SLA < 24 hours.

⚠️
Batch API key facts for exam: ~50% cost reduction. Up to 24 hours to complete. custom_id enables: (1) correlation, (2) selective retry of only failed docs, (3) idempotency. Best practice: custom_id = "invoice-vendorA-2024-01-15-00042" — enough info to identify the document without lookup.
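The custom_id idea can be sketched as two helpers: one that builds self-describing request entries and one that picks out what to resubmit. The document fields (vendor, date, seq, text) are hypothetical, and results are assumed to be dicts carrying custom_id and a result type where "succeeded" marks success.

```python
def make_custom_id(vendor, invoice_date, sequence):
    """Build an id that identifies the document without a lookup."""
    return f"invoice-{vendor}-{invoice_date}-{sequence:05d}"


def build_batch_requests(documents, model, max_tokens=1024):
    """One request entry per document, each tagged with its custom_id."""
    return [
        {
            "custom_id": make_custom_id(d["vendor"], d["date"], d["seq"]),
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": d["text"]}],
            },
        }
        for d in documents
    ]


def ids_to_retry(results):
    """custom_ids whose result type is anything other than 'succeeded'."""
    return [r["custom_id"] for r in results if r["result"]["type"] != "succeeded"]


docs = [
    {"vendor": "vendorA", "date": "2024-01-15", "seq": 42, "text": "..."},
    {"vendor": "vendorB", "date": "2024-01-15", "seq": 7, "text": "..."},
]
requests = build_batch_requests(docs, model="claude-3-5-sonnet-20241022")
failed = ids_to_retry([
    {"custom_id": requests[0]["custom_id"], "result": {"type": "succeeded"}},
    {"custom_id": requests[1]["custom_id"], "result": {"type": "errored"}},
])
```

Only the failed ids are resubmitted, and because each id is derived from the document itself, a resubmission maps back to the same invoice.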
✍️

Prompt Engineering

Effective prompts show, not just tell. Few-shot examples cover edge cases that instructions miss. Multi-pass architectures break large tasks into verifiable chunks.

D4 · Prompt Engineering · 20% · Few-Shot
Few-Shot Examples — Show, Don't Just Tell
❌ Instructions only

"Extract invoice data. Normalize dates to ISO 8601. Convert written numbers to numeric values."

Result: "five hundred dollars" → "five hundred dollars" (not normalized)

✅ Instructions + examples

"Normalize dates to ISO 8601."

Example input: "March fifth, 2024"
Example output: {"invoice_date": "2024-03-05"}
Example input: "five hundred dollars"
Example output: {"total_amount": 500.00}

Result: Correct normalization

💡
Design your few-shot examples to cover:
• Happy path (standard format)
• Informal language ("about five hundred" vs "$500")
• Unusual format (bibliographic invoice, academic license)
• Missing fields (what to return when data isn't present — null, not fabricated)
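A few-shot prompt covering those cases could be assembled as in the sketch below. The prompt layout is an assumption, not a prescribed format. The first two example pairs come from this guide, and the missing-field pair is a made-up illustration of returning null.

```python
import json


def build_few_shot_prompt(instructions, examples, document):
    """Lay out instructions, then input/output pairs, then the real input."""
    parts = [instructions, ""]
    for text, expected in examples:
        parts.append(f'Example input: "{text}"')
        parts.append(f"Example output: {json.dumps(expected)}")
        parts.append("")
    parts.append(f'Input: "{document}"')
    parts.append("Output:")
    return "\n".join(parts)


examples = [
    ("March fifth, 2024", {"invoice_date": "2024-03-05"}),  # happy path
    ("five hundred dollars", {"total_amount": 500.0}),      # written number
    ("PO number: not provided", {"po_number": None}),       # missing field
]
prompt = build_few_shot_prompt(
    "Normalize dates to ISO 8601. Convert written numbers to numeric values.",
    examples,
    "Due: April second, 2024",
)
```

json.dumps renders None as null, so the missing-field example shows the model exactly what an absent value should look like.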
Multi-Pass Architecture
1

Pass 1 — Broad Review

Scan the entire document or codebase. Identify all issues, sections, or items that need attention. Output a structured list with priorities. Don't fix yet — just enumerate.

2

Pass 2 — Deep Dive on Flagged Items

For each item flagged in Pass 1, perform detailed analysis. Include explicit review criteria: "Check for N+1 queries, SQL injection risks, missing auth, PII logging." Supply the Pass 1 output as context.

3

Pass 3 — Fix and Verify

Generate fixes based on Pass 2 analysis. Then verify the fix doesn't introduce new issues. Separate generation from verification — different prompts with different criteria.

⚠️
Why multi-pass for large code reviews? A single pass on a 5,000-line file forces Claude to simultaneously identify issues AND prioritize AND explain — splitting attention across all three degrades quality on each. Dedicated passes allow focused, deep analysis.
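The three passes can be wired together as separate prompts. In this sketch, ask is a hypothetical callable that sends one prompt to the model and returns its text reply. The prompt wording is illustrative only.

```python
def multi_pass_review(code, ask):
    """Run enumerate, analyze, then fix and verify as separate model calls."""
    # Pass 1: enumerate issues only, no fixes
    flagged = ask(
        "List every issue in this code, one per line with a priority. "
        f"Do not fix anything.\n\n{code}"
    )
    issues = [line.strip() for line in flagged.splitlines() if line.strip()]

    # Pass 2: deep dive per flagged item, with explicit criteria
    analyses = [
        ask(
            "Analyze this flagged issue in depth. Check for N+1 queries, "
            "SQL injection risks, missing auth, PII logging.\n"
            f"Issue: {issue}\n\n{code}"
        )
        for issue in issues
    ]

    # Pass 3: generation and verification use different prompts
    fixes = [ask(f"Propose a fix based on this analysis:\n{a}") for a in analyses]
    verdicts = [ask(f"Verify that this fix introduces no new issues:\n{f}")
                for f in fixes]
    return list(zip(issues, fixes, verdicts))


calls = []


def fake_ask(prompt):
    # Stand-in model: returns two issues for Pass 1, a stub otherwise.
    calls.append(prompt)
    if prompt.startswith("List every issue"):
        return "HIGH: SQL injection in get_user\nLOW: bare except in load"
    return "stub reply"


report = multi_pass_review("def get_user(): ...", fake_ask)
```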
Explicit Review Criteria
Code review prompt with explicit criteria (Prompt Template)
Review the following Python code for these specific issues:

SECURITY:
- SQL injection (string formatting in queries)
- PII logged to stdout/files
- Hardcoded credentials

PERFORMANCE:
- N+1 query patterns (queries inside loops)
- Missing database indexes on foreign keys
- Synchronous I/O in async handlers

CORRECTNESS:
- Missing auth dependency on endpoints
- Bare except clauses that swallow errors
- Missing response wrapper {success, data, error}

FORMAT: For each issue found, output:
SEVERITY: [HIGH/MED/LOW]
FILE: path/to/file.py:line
ISSUE: one-line description
FIX: specific code change needed
Quick Check

Q1. Why are few-shot examples more effective than instructions alone for normalization tasks?

A. Examples reduce token count in the prompt
B. Examples bypass the model's training data
C. Examples demonstrate the exact transformation, while instructions may be ambiguous for edge cases
D. Examples are cached by the API for faster responses
Answer: C. "Convert written numbers to numeric" doesn't tell Claude what to do with "approximately five hundred". An example showing that input → 500.00 makes the expectation unambiguous.
📦

Context Management

The context window is finite. Poor placement, verbose outputs, and large tool results degrade performance and increase cost. Strategic patterns mitigate these limits.

D5 · Context & Reliability · 15% · Context Window
Lost-in-the-Middle Problem
Where Claude Pays Attention in a Long Context
[Diagram: attention across a long context. START: HIGH attention. MIDDLE: LOW attention zone (information placed here is often missed). END: HIGH attention.]

Mitigation Pattern: KEY FINDINGS at top, ACTION ITEMS at bottom

When generating reports from large tool outputs, structure the response to place the most important content in the high-attention zones. Middle content (detailed evidence, full data) is less likely to influence the final summary.

Synthesis prompt structure for large reports (Prompt Pattern)
Structure your response as follows:

## KEY FINDINGS (top, high attention)
- Finding 1: [most important insight]
- Finding 2: ...

## DETAILED EVIDENCE (middle)
[Comprehensive source citations and data]

## ACTION ITEMS (bottom, high attention)
- Action 1: [concrete next step]
- Action 2: ...
3 Context Management Patterns

📝 Scratchpad Files

Write intermediate results to disk instead of keeping them in the messages list. Read back only what's needed. Keeps the conversation context compact across many iterations.

🤖 Subagent Delegation

Delegate discovery tasks to subagents. The subagent processes verbose output and returns a compact summary. The coordinator never sees the raw verbosity — only the extracted facts.

🔧 Compact Tool Results

Post-tool hooks that trim large results: keep only the fields needed, normalize dates to ISO, cap list results at N items. Every tool result is re-sent on each subsequent API call, so trimming it once saves tokens on every later iteration.
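A compacting post-tool hook might look like the sketch below. The function name, the field names, and the omitted counter are all hypothetical.

```python
def compact_result(records, keep_fields, max_items=5):
    """Keep only the needed fields and cap the list; report what was dropped."""
    trimmed = [
        {k: r[k] for k in keep_fields if k in r}
        for r in records[:max_items]
    ]
    return {"items": trimmed, "omitted": max(0, len(records) - max_items)}


# Usage: 12 verbose records shrink to 3 small ones plus a count
records = [{"id": i, "status": "open", "debug_blob": "x" * 500} for i in range(12)]
compact = compact_result(records, ["id", "status"], max_items=3)
```

Reporting the omitted count lets Claude know the list was truncated, so it can ask for more instead of assuming it saw everything.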

State Persistence for Crash Recovery
state_persistence.py: Manifest pattern (Python)
import json
from datetime import datetime

# The manifest pattern: save state after each completed subagent.
# On restart: load manifest, skip completed agents, resume from last checkpoint.


def write_json(path, data):
    with open(path, "w") as f:
        json.dump(data, f, indent=2)


def load_json(path):
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}  # first run: nothing has completed yet


class StateManager:
    def save_manifest(self, results):
        manifest = {
            agent_id: {
                "status": r.status.value,
                "completed_at": datetime.now().isoformat(),
            }
            for agent_id, r in results.items()
        }
        write_json("manifest.json", manifest)

    def get_pending(self, all_agents):
        manifest = load_json("manifest.json")
        # Only return agents NOT already completed
        return [a for a in all_agents
                if manifest.get(a.id, {}).get("status") != "completed"]
Quick Check

Q1. Why does placing critical information in the middle of a long context reduce reliability?

A. Middle content is automatically compressed by the API
B. Attention mechanisms give less weight to content far from both ends of the context
C. The API truncates content from the middle when context is long
D. Middle content is cached and not re-read on each token
Answer: B. This is the lost-in-the-middle effect: attention weights are lower for content far from both ends of the window. Place key findings at top, actions at bottom.
🚨

Escalation & Human-in-the-Loop

Knowing when NOT to act is as important as knowing how to act. Escalation is a feature, not a failure — it preserves trust, ensures policy compliance, and creates the feedback loop that improves the system.

D5 · Context & Reliability · 15% · HITL
When to Escalate — Decision Framework
Escalation Decision Tree
[Decision tree: Agent receives request → Policy gap (no clear rule exists)? Yes: 🚨 ESCALATE (policy gap → human decides). No → User explicitly requested a human? Yes: 🚨 ESCALATE (user request → always honor). No → Making progress? Yes: ✅ Proceed. No progress after N retries: 🚨 ESCALATE.]
⚠️
4 escalation triggers to memorize:
1. Policy gap — no rule exists for this situation
2. Explicit user request — user asked for a human; always honor
3. Unable to make progress — N retries exhausted, still failing
4. High-value / irreversible action — configured threshold exceeded
Human-in-the-Loop Routing Workflow
1

Agent processes request, builds confidence assessment

For each action or extraction, compute a confidence score. Store the reasons for uncertainty: missing data, conflicting sources, ambiguous instructions.

2

Route based on confidence × risk threshold

High confidence + low risk → auto-process. Medium confidence or high risk → human review queue. Low confidence or any error → reject/escalate. Thresholds are configurable per use case.

3

Human review provides correction signal

Human decisions flow back as training signal. Track: which types of documents consistently require review, which agent decisions humans consistently override. Use this to improve thresholds.

4

Audit auto-processed items (5% random sample)

Even auto-processed items need periodic human sampling. This catches systematic miscalibration — e.g., if Claude consistently reports 0.92 confidence on documents that are actually 60% accurate.

Escalate vs Retry — Decision Matrix
Situation | Action | Reason
Tool returns TRANSIENT error | Retry | Temporary: network/rate issue will resolve
Tool returns BUSINESS error | Escalate | Policy decision needed, not a technical fix
User says "get me a manager" | Escalate immediately | Explicit user request: always honor
3 retries, still failing | Escalate | Unable to make progress: human must intervene
Validation error in extraction | Retry with feedback | Claude missed data; retry may recover it
Missing data (genuinely absent) | Human review | Retry won't help: data isn't in the document
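The matrix can be read as a single decision function. In this sketch the parameter names and return labels are hypothetical, and the category strings follow the lowercase ErrorCategory values used earlier in this guide.

```python
def decide_action(error_category=None, user_requested_human=False,
                  retries=0, max_retries=3, data_absent=False):
    """Pick the next step for one request, following the matrix above."""
    if user_requested_human:
        return "escalate_immediately"  # explicit request: always honor
    if data_absent:
        return "human_review"          # retry cannot recover absent data
    if error_category in ("business", "permission"):
        return "escalate"              # policy decision, not a technical fix
    if retries >= max_retries:
        return "escalate"              # unable to make progress
    if error_category == "transient":
        return "retry"
    if error_category == "validation":
        return "retry_with_feedback"
    return "proceed"


outcomes = [
    decide_action("transient"),
    decide_action("business"),
    decide_action("transient", user_requested_human=True),
    decide_action("transient", retries=3),
    decide_action("validation"),
    decide_action(data_absent=True),
]
```

The user-request check comes first on purpose: no error state or retry budget should delay a handoff the user asked for.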
Quick Check

Q1. A user says "I'd like to speak with a human agent about my account." What should the agent do?

A. Attempt to resolve the issue first, then escalate if it can't be solved
B. Escalate immediately — explicit user request for a human must always be honored
C. Ask the user to clarify why they want a human before escalating
D. Escalate only if the agent cannot resolve the issue on the next attempt
Answer: B. An explicit user request for human handoff is a non-negotiable escalation trigger. Attempting to resolve first or asking for clarification violates the user's expressed preference, so escalate immediately.
🎯

Practice Exam

30 scenario-based questions across all 5 CCA-F domains. Select an answer — if correct you'll see the explanation. If wrong, the attempt is counted but you can try again until you get it right.

All 5 Domains · 140 Questions · Retry on Wrong