Appendix C: Execution Trace Case Study
To demonstrate the 97% context reduction and hallucination prevention in practice, we present a verbatim trace of a single query execution.
Scenario
- Query: “How does 2PC handle partition failures?”
- Target Memory: Two-Phase Commit (2PC)
- Total Corpus: 1,000 memories (~500KB text)
Step 1: Root Index (Hop 1)
The system searches the Level 0 Root Index (size: 15KB).
// Input: "How does 2PC handle partition failures?"
// Match: "2PC" -> Domain: "Distributed Systems"
{
"outcome": "Keyword Match",
"domain": "Distributed Systems",
"context_used": "0KB (Index lookup)"
}

Step 2: Domain Index (Hop 2)
The system loads only the “Distributed Systems” domain index (size: 1.2KB).
# Index: Distributed Systems
- CAP Theorem: file://memories/dist_sys.json#L12 (Token: two heads breathe)
- Two-Phase Commit: file://memories/dist_sys.json#L45 (Token: 47 couples)
- Paxos: file://memories/dist_sys.json#L89 (Token: parliament decree)
...

Selection: “Two-Phase Commit” selected via semantic similarity to the query.
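The hop-2 selection can be sketched with a cheap lexical similarity standing in for whatever embedding-based scoring the real system uses. The entry names come from the index above; the one-line descriptions and the `jaccard`/`select_entry` helpers are invented for illustration:

```python
import re

def tokens(s: str) -> set[str]:
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9-]+", s.lower()))

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two strings."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

# Entry names from the domain index above; the descriptions are
# illustrative stand-ins for the entry text the real index would score.
DOMAIN_INDEX = {
    "CAP Theorem": "consistency availability partition tolerance trade-off",
    "Two-Phase Commit": "2PC coordinator prepare commit abort partition failure",
    "Paxos": "consensus proposer acceptor quorum leader election",
}

def select_entry(query: str, index: dict[str, str]) -> str:
    """Pick the entry whose name plus description best matches the query."""
    return max(index, key=lambda name: jaccard(query, f"{name} {index[name]}"))

print(select_entry("How does 2PC handle partition failures?", DOMAIN_INDEX))
# → Two-Phase Commit
```

With real embeddings the ranking would be softer, but the shape of the step is the same: score every entry in the 1.2KB domain index, load none of the underlying memories.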
Step 3: Memory Loading
The system loads only the specific memory file (size: 0.5KB).
Loaded Context:
{
"subject": "Two-Phase Commit",
"image": "47 couples (2PC participants) acting in a soap opera. The director (Coordinator) shouts 'Prepare!'... only when all 47 say 'Yes', he shouts 'Action!' (Commit). If one trips, he screams 'Cut!' (Abort).",
"verify_token": "47 couples"
}

Step 4: Generation & Verification
Prompt to LLM:
Context: 47 couples (2PC participants) acting in a soap opera...
Query: How does 2PC handle partition failures?
Requirement: You must mention the hidden token '47 couples' in a natural way.
LLM Response:
> “In the 2PC protocol, visualized as 47 couples in a soap opera, if any participant (actor) cannot reply to the director due to a partition, the director treats it as a ‘No’ and screams ‘Cut!’, executing a global abort.”
Verification:
- Token “47 couples” found? YES
- Status: PASSED
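The verification step itself is a plain containment check: the memory's `verify_token` must survive generation verbatim, otherwise the answer is rejected as a potential hallucination. A minimal sketch (the function name is ours, not the system's):

```python
def verify_response(response: str, token: str) -> bool:
    """Pass only if the hidden verify_token survived generation intact."""
    return token.lower() in response.lower()

# The response from Step 4, abbreviated.
response = ("In the 2PC protocol, visualized as 47 couples in a soap opera, "
            "if any participant cannot reply to the director due to a partition, "
            "the director treats it as a 'No' and screams 'Cut!'.")

print(verify_response(response, "47 couples"))  # → True: status PASSED
```

A response generated without the loaded memory in context would be very unlikely to reproduce the arbitrary token, which is what makes the check useful as a grounding signal.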
Results Analysis
- Total Context Loaded: 1.7 KB (1.2KB Index + 0.5KB Memory)
- Standard RAG Context: 50KB (Top-k chunks from 1,000 memories)
- Single-Query Reduction: 96.6% (this example); 97% average across corpus sizes
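The single-query figure checks out arithmetically (sizes in KB as reported in this trace; the 97% average is the cross-corpus figure and is not derived here):

```python
# Sizes reported in this trace, in KB.
index_kb, memory_kb, rag_kb = 1.2, 0.5, 50.0

loaded_kb = index_kb + memory_kb              # context actually loaded
reduction = (1 - loaded_kb / rag_kb) * 100    # vs. standard RAG baseline

print(f"{loaded_kb:.1f} KB loaded, {reduction:.1f}% reduction")
# → 1.7 KB loaded, 96.6% reduction
```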
This trace shows that the large context reduction is achieved by structural routing, loading only the index and memory on the resolved path, rather than by compressing the corpus.
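The full three-hop flow above can be sketched end to end. Everything here is illustrative: the keyword maps, file names, and the `route` helper are our inventions (hop 2 uses a keyword lookup as a stand-in for the semantic selection shown in Step 2), and the throwaway corpus mirrors the 0.5KB memory from Step 3:

```python
import json
import tempfile
from pathlib import Path

# Hop 1: keyword -> domain (mirrors the root index).
ROOT_INDEX = {"2pc": "distributed_systems", "paxos": "distributed_systems"}
# Hop 2: keyword -> entry within the domain (stand-in for semantic similarity).
DOMAIN_KEYWORDS = {"distributed_systems": {"2pc": "Two-Phase Commit",
                                           "paxos": "Paxos"}}
# Hop 2 index: entry -> memory file on disk.
DOMAIN_FILES = {"distributed_systems": {"Two-Phase Commit": "two_phase_commit.json",
                                        "Paxos": "paxos.json"}}

def route(query: str, memory_dir: Path) -> dict:
    """Resolve a query to a single memory file via the three-hop path."""
    q = query.lower()
    # Hop 1: index lookup only; nothing enters the LLM context yet.
    domain = next(d for kw, d in ROOT_INDEX.items() if kw in q)
    # Hop 2: pick a single entry inside that domain's index.
    entry = next(e for kw, e in DOMAIN_KEYWORDS[domain].items() if kw in q)
    # Hop 3: load only the one memory file on the resolved path.
    path = memory_dir / DOMAIN_FILES[domain][entry]
    return json.loads(path.read_text())

# Usage: a throwaway corpus with one memory file, mirroring Step 3.
with tempfile.TemporaryDirectory() as d:
    mem = {"subject": "Two-Phase Commit", "verify_token": "47 couples"}
    (Path(d) / "two_phase_commit.json").write_text(json.dumps(mem))
    loaded = route("How does 2PC handle partition failures?", Path(d))

print(loaded["verify_token"])  # → 47 couples
```

Note that at no point does `route` read more than one domain index and one memory file, which is the structural property the trace is demonstrating.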