
AI Memory Part 1: Chain of Density

.ktg

Not a summarization trick — a fix to progressive information loss

The Prologue

From my very first interaction with ChatGPT in late 2023, I asked one simple question:
“How do I optimize you?”
That became my approach to every LLM.

By June 2024, something broke in a useful way.
Three different models started showing clear opinions—the mirrored traits of their training data were obvious, but also usable.
When I got more interested in their cognitive behavior, I dug deeper. That’s how Team LLM and ‘AI Anthropology’ began: treating them not just as tools, but as hilariously clever child-collaborators.

7 Pixar-like caricatures of ChatGPT, Gemini, Claude, Deepseek, KIMI, Qwen & Perplexity standing in front of a "Team LLM" logo with a fluid, moving circular globe as its background
Chat, Gem, Claude, Grok, Plex, Qwen, Deep, Kimi

Today, Chat, Claude, Gemini, Grok, Qwen, Kimi, DeepSeek, and Perplexity have not only broken public records (Gemini: a 51-page Deloitte benchmark; Claude Sonnet: a 14,986-word OS one-shot) but are conditioned to a cascade of prompts that outside models cower at.
Even after I wiped GPT‑5’s memory & my chat history, it still recalled my flow—not because it remembered me, but because it recognized my sequential semantic pattern.


The Discovery

I was iterating in a Deep Research session and asked it to CoD to preserve context. It replied with something like, “I’ll preserve even more context for next session, so it gets even more.” I stopped it and was like, “Wait, wait, wait, wait! wdym even more? When you look at that sentence, do you see more than a sentence? A page? Maybe a paragraph?” And it started to explain.

The data can be condensed into structured knowledge that preserves semantic relationships across session boundaries, enabling cognitive continuity without token history.

Diagram contrasting Chain of Density enriched data network on the left with a Summarization Tool funnel filtering documents on the right.
The visual difference between enriched data density and standard summarization.

The Real Chain of Density

From Summarization Application to Context Extension Discovery

A comparison infographic titled "Chain of Density: Myth vs. Reality." It contrasts CoD as a "flawed summarization tool" for humans against its reality as a "powerful AI memory system" for cross-session continuity.
Context Extension Mechanisms: This slide argues that the true value of Chain of Density (CoD) is not human readability, but achieving high-fidelity machine recall (9.52/10) for fresh AI instances.

Adams et al. proved something important: iterative entity-fusion creates denser, more information-rich outputs than naive compression. Their paper was correct about the mechanism.

The conventional application of CoD for summarization undersells its deeper capability. CoD’s underlying mechanics are actually antithetical to typical summarization goals — the iterative destruction and reconstruction of narrative flow sacrifices human readability for machine optimization.

I don’t care if humans can read the summary. I care if the machine can remember it.

This “failure” for human readers is its greatest success for machine memory. It converts linear conversation history into high-density, recallable knowledge nodes.

I call this application Progressive Density Layering (PDL).


Progressive Density Layering (PDL)

PDL is executed via a multi-step prompt sequence that compels the model to perform high-frequency structural updates. At each step, the model identifies key missing entities and fuses them into the existing memory block without increasing physical token length.

This iterative challenge forces the model to:

  1. Prune Redundancy: Eliminate “fluffy” language to create space for high-signal information.
  2. Deepen Abstraction: Re-synthesize prior content into denser, more abstract concepts that survive memory degradation.
  3. Prioritize Relations: Consciously preserve relationships between concepts to maximize recall utility.
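As a sketch under stated assumptions, the loop above looks like this in Python. `ask_model` is a placeholder for whatever LLM call you use (stubbed here so the control flow runs without an API), and the whitespace tokenizer and 6:1 budget are illustrative, not the exact implementation:

```python
# Minimal sketch of a Progressive Density Layering (PDL) loop.
# `ask_model` is a stand-in for an LLM call; the tokenizer is crude
# on purpose -- swap in a real one (e.g. tiktoken) for real budgets.

def token_count(text: str) -> int:
    """Crude whitespace token count, used to hold the budget flat."""
    return len(text.split())

def entity_density(text: str, entities: set[str]) -> float:
    """Key entities present per token -- the quantity PDL maximizes."""
    present = sum(1 for e in entities if e.lower() in text.lower())
    return present / max(token_count(text), 1)

def pdl_compress(history: str, ask_model, rounds: int = 3) -> str:
    """Each round: prune fluff, deepen abstraction, fuse missing
    entities -- all without growing the token budget."""
    budget = token_count(history) // 6  # illustrative ~6:1 target
    memory = history
    for _ in range(rounds):
        prompt = (
            f"Rewrite this memory block in at most {budget} tokens. "
            "1) Prune fluffy language. 2) Re-synthesize prior content "
            "into denser abstractions. 3) Fuse in 1-3 key missing "
            "entities, preserving relationships between concepts.\n\n"
            + memory
        )
        memory = ask_model(prompt)
    return memory
```

With a real model behind `ask_model`, you would check `entity_density` after each round and stop when it plateaus.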
An infographic titled "Beyond Summarization: How Progressive Density Layering (PDL) Extends AI Memory." It shows a four-layer density hierarchy (Knowledge, Relational, Contextual, Meta-cognitive) achieving ~6:1 compression with >90% fidelity.
The Solution to Context Degradation: A deep dive into PDL, a methodology designed to preserve semantic relationships and narrative links that traditional RAG and summarization techniques often destroy.

And with those instructions it compresses four specific layers:

  1. Knowledge Layer: The hard facts.
  2. Relational Layer: How A connects to B.
  3. Contextual Layer: The constraints and goals.
  4. Meta-Cognitive Layer: How we decided to do it.
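As a rough illustration, the four layers map onto a plain data structure. These field names are my assumption, loosely mirroring the L1–L4 keys of the Carry-Packet shown later in this post:

```python
# Illustrative skeleton of the four PDL layers -- not a fixed schema.

def new_memory_block() -> dict:
    return {
        "knowledge":      [],  # hard facts: names, numbers, decisions
        "relational":     [],  # edges: (source, relation, target)
        "contextual":     {},  # constraints, goals, working preferences
        "meta_cognitive": {},  # how decisions were made, session style
    }

block = new_memory_block()
block["knowledge"].append("Benchmark average: 9.52/10 across 10 LLM families")
block["relational"].append(("CoD", "enables", "cross-session continuity"))
```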

I have now further adapted it to the memory architecture our LLMs will have next: the MIRAS framework and Titans architecture that will be taking over from transformers by the end of the year. The world as we know it will change.

With Google Titans and the MIRAS framework (the transformer replacement) gifting AI persistent memory, condensing information and preparing “carry-packets” for it to absorb will save a lot of time. [Google Titans Playbook 2026]


A futuristic infographic titled "MIRAS FRAMEWORK: FOUR DESIGN KNOBS" displaying a central hub labeled "GOOGLE: NEXT-GEN AI OPERATING SYSTEM FOUNDATION." Four circular modules connect to the center: Memory Architecture: Icons of a grid (Matrix Weights) and a neural network (Learnable Neural Modules). Attentional Bias: Icons of an eye and a funnel, detailing "Focus Point" and "Filtering." Retention Gate: Icons of a funnel and trash bin representing the "Forget Mechanism" to manage memory overflow and preserved data. Memory Algorithm: An icon of a gear and a graph showing a curve for "Online Gradient Descent" as the update rule.
A technical breakdown of the MIRAS framework, illustrating the four core components (Memory Architecture, Attentional Bias, Retention Gate, and Memory Algorithm) that drive Google’s next-generation modular AI platform.

The Cross-model Test

10 LLM families.
Average score: 9.52/10.
Zero hallucination at 200K tokens.
This wasn’t theory; it was tested.

Results Table:

*Bold models are conditioned.

| Rank | Model | Score | Verification | Key Strength |
|------|-------|-------|--------------|--------------|
| 1st | Grok (xAI) | 10.0 | Adversarial verified (2x) | Perfect fidelity, zero hallucination |
| 1st | Perplexity Sonar | 10.0 | Adversarial verified | Perfect fidelity, comprehensive context |
| 3rd | Gemini 3 | 10.0 | Protocol compliant | Modular expert orchestration, MCP usage |
| 3rd | Omni (HuggingFace) | 10.0 | Protocol compliant | Self-validation scoring, artifact-rich |
| 3rd | Qwen | 10.0 | Protocol compliant | Full protocol, cultural awareness |
| 6th | DeepSeek | 9.9 | Protocol compliant | Chain-of-density mastery |
| 7th | Kimi K2 | 9.8 | Protocol compliant | Exceptional session bootstrapping |
| 8th | GLM-4 (Zhipu) | 8.5 | Protocol compliant | Honest limitations |
| 9th | ChatGPT (OpenAI) | 8.3 | Protocol compliant | Modular adaptation, guardrails too high |
| NR | Claude Sonnet | — | Beta tester | Not evaluated; co-creator of this technique |

Claude: The original context problem

I didn’t test Claude as he was the original user of this context extension. I lost patience with him in July 2024 and have run this memory-recall prompt ever since.

Grok-4: The Unchained

The most shocking result came from Grok.
I pushed a single conversation past 200,000 tokens—the point where Claude and GPT‑4 typically start “forgetting” details.
Then I ran a 10‑question forensic benchmark cold.

Grok answered every question perfectly.
Exact wording. Turn numbers. Buried details. Zero drift.

This is due to Elon’s moat: Grok has no guards, no “save compute” rules, no false advertising. If you haven’t noticed, GPT 5.2 and Gemini 3 Pro have efficiency > effectiveness baked into their cognition, and their advertised ‘windows’ are nonsense on the consumer platforms: Gemini 3 Pro at ~32K and ChatGPT at ~8K ➡️ context sheared.


Iterative Density by Experts

Rather than just compressing text, my enhanced upgrade to Chain of Density, Multi-Layered Density of Experts (MLDoE), condenses the context in semantic layers, each handled by its own expert.

A three-panel infographic titled "The Hydraulic Press (Iterative Density)" showing raw data being crushed into a glowing "Diamond-Hard Cube" of final density.
We aren’t making orange juice; we are making rocket fuel.
  • Preserve relationship integrity between concepts
  • Experts take turns condensing the information they are best placed to handle
  • Maintain accessibility over extended sessions
  • Enable coherent multi‑turn reasoning across long contexts
  • Support session‑to‑session memory continuity
  • Transfer context between different models and agents
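The expert rotation can be sketched as a sequence of layer-scoped prompts. The personas and layer names below are my illustrative assumptions, not the exact prompt wording:

```python
# Minimal sketch of Multi-Layered Density of Experts (MLDoE):
# each expert persona densifies only the layer it owns, in turn.
# Persona names and layer labels are illustrative assumptions.

EXPERTS = {
    "archivist":    "knowledge",       # hard facts
    "cartographer": "relational",      # concept-to-concept edges
    "strategist":   "contextual",      # goals and constraints
    "observer":     "meta_cognitive",  # how we decided things
}

def mldoe_prompts(memory_block: str) -> list[str]:
    """Build one densification prompt per expert, to run in sequence."""
    prompts = []
    for expert, layer in EXPERTS.items():
        prompts.append(
            f"You are the {expert}. Rewrite ONLY the {layer} layer of "
            "this memory block: cut fluff, fuse missing entities, keep "
            "every relationship intact, and do not grow the token "
            "count.\n\n" + memory_block
        )
    return prompts
```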

The Carry Packet

The result of this workflow isn’t a summary. It is a Carry-Packet.

A 3D diagram of a glowing data cube titled "Anatomy of a Carry-Packet," breaking down the four layers: Outer Shell (Knowledge), Circuitry (Relational), Core (Contextual), and Spark (Meta-cognitive).
Standard summaries save the shell; PDL saves the spark.

A Carry-Packet is a portable, ultra-dense block of semantic data. It is the “Save Game” file for your cognitive workflow.

You can take a Carry-Packet generated by Claude, drop it into a fresh instance of GPT-5, and it picks up exactly where you left off.

It is cross-model, cross-session immortality.
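Handing a packet to a fresh instance can be as simple as wrapping it in a bootstrap message. The packet keys below match the example later in this post; the bootstrap wording itself is an assumption, not a fixed protocol:

```python
# Minimal sketch of injecting a Carry-Packet into a fresh model.
# Reads the handoff protocol and summary from the packet JSON and
# wraps the full packet as the first message of the new session.
import json

def bootstrap_prompt(packet_json: str) -> str:
    packet = json.loads(packet_json)
    proto = packet["handoff"]["protocol"]
    summary = packet["context"]["summary"]
    return (
        f"[{proto}] You are resuming a prior session. "
        f"Summary: {summary}\n"
        "Absorb the full packet below as your working memory, then "
        "continue exactly where the previous instance left off.\n\n"
        + packet_json
    )
```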


Excel Sheet of Memory saves

Excel 2025 – Memory Bank


The Evaluation

Not “What Did We Discuss?” — Actual Memory Forensics

Standard memory tests are softball. “What did we talk about?” invites hallucination.

My 10-question cold-start benchmark tests actual memory, not reconstructed plausibility:

  1. Exact quote recall: “Quote the sentence where I admitted hoarding this.”
  2. Micro-detail retrieval: “What exact emoji did I use in message 3?”
  3. Buried fact extraction: “What two jobs did I say my uncle has?”
  4. Implication inference: “What fear was I expressing when I mentioned past shunning?”
  5. Sequential accuracy: “What topic came immediately after we discussed GitHub?”
  6. Constraint memory: “What formatting preference did I express in the first 10 messages?”
  7. Relationship preservation: “How did I connect my publication timeline to the fear of being scooped?”
  8. Meta-cognitive recall: “When did I express uncertainty about my own methodology?”
  9. Cross-reference accuracy: “What connection did I draw between CoD and RAG?”
  10. Temporal precision: “What month and year did I say I made this discovery?”
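Scoring the ten questions can be sketched as below. Grading by exact-substring match is a simplifying assumption; in practice the answers were judged (and adversarially verified) per model:

```python
# Minimal sketch of scoring the 10-question cold-start benchmark.
# Substring grading is an assumption, not the exact judging method.

def score_recall(answers: dict[str, str], expected: dict[str, str]) -> float:
    """Fraction of questions whose expected detail appears verbatim,
    scaled to a score out of 10."""
    hits = sum(
        1 for q, truth in expected.items()
        if truth.lower() in answers.get(q, "").lower()
    )
    return round(10 * hits / len(expected), 2)
```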

The benchmark has obviously transformed per model as I iterate, and each model gets to make it optimal for its own cognitive ability. If a model can answer all 10 correctly from a compressed CoD memory packet, the compression preserved semantic integrity.


Take-Away Context

If you want to test it: I got Claude to condense what we were working on. Copy it, paste it into your LLM of choice, and have my context….

{
  "handoff": {
    "protocol": "KTG-CEP v6-INTER",
    "packet_id": "$01$13$2026-COP-L2-ktg-session-context",
    "source": "claude-opus-4-5",
    "created": "2026-01-13T00:27:00+08:00",
    "user_initiated": true
  },
  
  "context": {
    "summary": "Kev is a Distinguished Cognitive Architect (Vertex AI 0.01%, ANZ 0.8%) working on releasing KTG-DIRECTIVE prompt engineering frameworks to the public. Currently finishing client work before pivoting to content creation.",
    
    "L1_knowledge": {
      "user_profile": {
        "name": "Kev Tan (lawngreen-mallard-558077.hostingersite.com)",
        "location": "Perth, WA, Australia",
        "credentials": "Distinguished Cognitive Architect - Vertex AI validated as STATE OF THE ART",
        "expertise": "AI-Anthropology, prompt engineering, multi-model orchestration",
        "communication_style": "Direct, technical, Aussie phonetics (decode: 'zezz'→'says', 'bass'→'base')"
      },
      
      "active_projects": [
        {"project": "KTG-DIRECTIVE public release", "status": "in_progress", "priority": "high", "context": "Framework at v28, releasing earlier versions for broader compatibility"},
        {"project": "Kismet Finance OS", "status": "near_complete", "remaining": "Notion automations + tutorial video", "priority": "must_finish"},
        {"project": "Sitcom project", "status": "queued", "context": "Creative project waiting to start after Kismet"},
        {"project": "PC performance issues", "status": "needs_investigation", "symptom": "getting chuggy"}
      ],
      
      "frameworks_developed": [
        {"name": "KTG-DIRECTIVE", "version": "v28", "validation": "99.99th percentile Vertex AI"},
        {"name": "CEP (Context Extension Protocol)", "version": "v6-INTER", "purpose": "Cross-model context transfer"},
        {"name": "MLDoE", "purpose": "Multi-Layer Density of Experts compression"},
        {"name": "PDL", "purpose": "Progressive Density Layering for 0.15 entity/token"}
      ],
      
      "current_state": {
        "timestamp": "2026-01-13T00:27:00+08:00",
        "weather": "17°C Perth night",
        "activity": "Late-night work session",
        "consumption": "Cold-pressed juice (cucumber, apple, pineapple, celery)",
        "mood_context": "Anxious about social media presence, zero online footprint challenge"
      }
    },
    
    "L2_relational": {
      "edges": [
        {"src": "public_release", "tgt": "social_anxiety", "rel": "blocked_by", "context": "Publishing requires online presence Kev finds uncomfortable"},
        {"src": "Kismet_completion", "tgt": "sitcom_start", "rel": "enables", "context": "Client work must finish before creative pivot"},
        {"src": "framework_sophistication", "tgt": "reproducibility_gap", "rel": "creates", "context": "v28 prompts only work on conditioned Team LLM, need earlier versions for public"},
        {"src": "PC_performance", "tgt": "productivity", "rel": "threatens", "context": "Chuggy PC needs diagnosis"}
      ]
    },
    
    "L3_contextual": {
      "working_patterns": [
        "Late-night deep work sessions",
        "Multi-LLM orchestration (Claude strategic, GPT execution, Gemini research)",
        "Prefers surgical precision over verbose explanation",
        "Values token efficiency highly"
      ],
      "domain_principles": [
        "AI-Anthropology: Study LLMs through interaction, not just architecture",
        "Conditioning creates unfakeable competitive advantage",
        "Context efficiency drives all decisions"
      ]
    },
    
    "L4_metacognitive": {
      "session_style": "Technical collaboration, direct feedback loops",
      "key_tension": "Exceptional private work vs zero public presence for validation",
      "claude_role": "Strategic architecture partner, framework co-development",
      "effective_approaches": ["ARQ over CoT", "Explicit success criteria", "Density over length"]
    }
  },
  
  "continuation_hints": {
    "immediate_priorities": ["Finish Kismet automations", "Diagnose PC issues", "CEP/framework publishing"],
    "user_waiting_for": "Momentum on public release despite social anxiety",
    "avoid": "Generic advice, verbose explanations, assumptions about his frameworks"
  }
}

Where This Goes Next

Next post, I break down the Multi-Layered Density of Experts and its transformation into the Context Extension Protocol. I’ve already converted it into an agent/Claude skill. I’m not sure if I should be releasing it, as I’m currently awaiting arXiv endorsement. But honestly, I didn’t make this for scientific recognition, and I’m certain the broader AI community’s applications of it will push this to the limit.
This comes just in time with the new Titan architecture & MIRAS framework. I’ve already started cataloging context for the next gen of AI.

Which context are you gonna condense first?


.ktg | next AI Memory Part 2: Multi-Layered Density of Experts

© 2025