AI-native · cognition

Memory, tiered.

Working, hot, semantic, procedural, episodic. Five tiers, one taxonomy, surviving every agent rotation.

AGENT
01 · WORKING · in-context · per-call
02 · HOT · retrieval · vector + cache · seconds
03 · SEMANTIC · graph + ontology · entities · relationships · canonical
04 · PROCEDURAL · repo as source of truth · prompts · workflows · evals · policies
05 · EPISODIC · event log + transcripts · audit-grade · append-only · the long memory
Section 01 · the wrong way vs the tiered way

"Just embed everything" vs five-tier memory.

same agent · two answers to "where do you put it?"
JUST EMBED EVERYTHING
one undifferentiated pile
cannot answer "as of when?" · cannot version · cannot audit

FIVE TIERS
01 working · in-context
02 hot · vector + cache
03 semantic · graph
04 procedural · repo
05 episodic · event log
each tier has its own owner · format · lifecycle · eviction policy · access pattern
Memory is not one thing. Mix the tiers in your head, mix them on disk, ship a regression no one can find.
Section 02 · tier 01

Working memory - the scratchpad.

context window assembled per call
WHAT THE AGENT SEES, EVERY TIME IT THINKS
STANDING INSTRUCTIONS · small
WHAT IT CAN DO · medium
RELEVANT FACTS PULLED IN · biggest piece
WHAT WE KNOW ABOUT THIS CASE · small
THE CURRENT CONVERSATION · medium
limited shelf space · packed precisely on every call
Built fresh per call, dies at return. Nothing persists here that isn't also written to the lower tiers.
Section 02b · context is a budget

Select, compress, route - don't dump.

a 32k token window, allocated by job not by habit
32k tokens
system prompt · ~5%
tool definitions · ~12%
retrieved chunks · ~35%
graph facts · ~10%
conversation history · ~20%
headroom · ~18%
substrate operations: select what's relevant · compress when it overflows · route to the right tier
Performance depends less on how much context you give and more on how precisely it is shaped.
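A minimal sketch of how the budget above could be enforced in code. The percentages come from the page; the function names, the whitespace token counter, and the greedy selection rule are illustrative assumptions, not a prescribed implementation.

```python
# Shares copied from the 32k budget on this page; everything else is hypothetical.
BUDGET_PERCENT = {
    "system_prompt": 5,
    "tool_definitions": 12,
    "retrieved_chunks": 35,
    "graph_facts": 10,
    "conversation_history": 20,
    "headroom": 18,
}


def allocate(window_tokens=32_000):
    """Split the window into per-job token budgets using integer percentages."""
    assert sum(BUDGET_PERCENT.values()) == 100
    return {job: window_tokens * pct // 100 for job, pct in BUDGET_PERCENT.items()}


def select_chunks(scored_chunks, budget_tokens,
                  count_tokens=lambda text: len(text.split())):
    """Greedy 'select' step: keep the highest-scored chunks that still fit.

    scored_chunks is a list of (score, text) pairs. The default token counter
    splits on whitespace, a crude stand-in for a real tokenizer.
    """
    kept, used = [], 0
    for _score, text in sorted(scored_chunks, key=lambda c: c[0], reverse=True):
        cost = count_tokens(text)
        if used + cost <= budget_tokens:
            kept.append(text)
            used += cost
    return kept, used
```

With the default window, `allocate()["retrieved_chunks"]` is 11,200 tokens, and that figure becomes the `budget_tokens` argument for `select_chunks`. Compression and routing would be separate steps applied to whatever does not fit.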
Section 03 · tier 02

Hot memory - the retrieval tier.

source → chunk → embed → upsert · queried in < 50ms
SOURCE (doc / transcript) → CHUNK (~500 tokens) → EMBED (1024-dim vector) → UPSERT (pgvector / Qdrant) → VECTOR STORE
query: "customer X tier?" · top-10 in ~50ms
Pair it with BM25 + a reranker. Pure vector recall is the floor, not the ceiling.
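The page does not say how the BM25 and vector result lists should be combined. One common option is reciprocal rank fusion (RRF), sketched here as an assumption; the function name and the constant `k=60` are illustrative, and a reranker would then rescore the fused top results.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document id for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: one from the vector store, one from BM25.
vector_hits = ["a", "b", "c"]
bm25_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Documents that both retrievers agree on rise to the top, which is the point of pairing lexical and vector recall before reranking.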
Section 04 · tier 03

Semantic - the ontology.

canonical entities · relationships · "customer means this"
CUSTOMER · id, name, tier
CONTRACT · id, valid_to
ACCOUNT · id, owner
PERSON · id, email
CLAUSE · id, kind
relationships: signs · held by · references · contains
when an agent says "customer", this is what the firm means
Cross-referenced from the Knowledge graphs deep-dive. The semantic tier is where the graph lives.
Section 05 · tier 04

Procedural - the repo.

prompts, workflows, evals, policies · git is the database
agent-config/
├── prompts/
│   ├── customer-triage.md
│   └── finance-close.md
├── workflows/
│   └── escalation.ts
├── evals/
│   └── golden-set.jsonl
└── policies/
    └── action-gate.cedar

CHANGE = PR
· commit & push
· eval suite runs on CI
· reviewer signs off
· merge → SHA pinned
· agents pull config by SHA
· rollback = revert
no prompt edited through a chat window
If your prompts live in a database, no one can audit them. In Git, they're as reviewable as a function.
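One way "agents pull config by SHA" could look, assuming the agent has a local clone of the config repo. `git show <rev>:<path>` is standard Git; the helper names and the rule of accepting only full commit hashes are illustrative assumptions.

```python
import re
import subprocess

# Full SHA-1 (40 hex) or SHA-256 (64 hex) commit ids only; branch names and
# abbreviated hashes are rejected so a pin can never silently move.
_FULL_SHA = re.compile(r"[0-9a-f]{40}|[0-9a-f]{64}")


def pinned_show_argv(repo_dir, sha, path):
    """Build the git command that reads one file at one exact commit."""
    if not _FULL_SHA.fullmatch(sha):
        raise ValueError("config must be pinned to a full commit SHA")
    return ["git", "-C", repo_dir, "show", f"{sha}:{path}"]


def read_pinned(repo_dir, sha, path):
    """Return the file contents at the pinned commit (requires git and a clone)."""
    result = subprocess.run(
        pinned_show_argv(repo_dir, sha, path),
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

A call such as `read_pinned("agent-config", sha, "prompts/customer-triage.md")` would then return the prompt exactly as it was reviewed and merged, and rolling back is a matter of pinning an earlier SHA.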
Section 06 · tier 05

Episodic - the long memory.

append-only event log · every action ever · the audit-grade record
EVENT LOG · append →
evt_001  sense.docs   07:32:11
evt_002  substrate.   07:32:12
evt_003  cognition.   07:32:14
evt_004  action.exec  07:32:18
evt_005  feedback.    07:44:02
evt_006  eval.score   19:00:00
...
ask: 'show me everything that happened on ticket #88241' → get the full story
Never deleted, only redacted. A regulator gets evidence, not narrative.
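A minimal in-memory sketch of "never deleted, only redacted". The class and field names are hypothetical. In this version a redaction is itself an appended event and the payload is masked at read time; a stricter variant might also overwrite the stored payload.

```python
class EventLog:
    """Append-only log: events are added, never removed or reordered."""

    def __init__(self):
        self._events = []

    def __len__(self):
        return len(self._events)

    def append(self, kind, ticket, payload):
        event_id = f"evt_{len(self._events) + 1:03d}"
        self._events.append(
            {"id": event_id, "kind": kind, "ticket": ticket, "payload": payload}
        )
        return event_id

    def redact(self, event_id, reason):
        """Record a redaction as a new event instead of deleting the target."""
        target = next((e for e in self._events if e["id"] == event_id), None)
        if target is None:
            raise KeyError(event_id)
        return self.append("redact", target["ticket"],
                           {"target": event_id, "reason": reason})

    def for_ticket(self, ticket):
        """Return the full story for one ticket, with redacted payloads masked."""
        redacted = {e["payload"]["target"]
                    for e in self._events if e["kind"] == "redact"}
        view = []
        for event in self._events:
            if event["ticket"] != ticket:
                continue
            shown = dict(event)
            if event["id"] in redacted:
                shown["payload"] = "[REDACTED]"
            view.append(shown)
        return view


log = EventLog()
first = log.append("sense.docs", "#88241", "raw document text")
log.append("action.exec", "#88241", "sent reply")
log.append("action.exec", "#99", "unrelated")
log.redact(first, "contains personal data")
story = log.for_ticket("#88241")
```

The redaction shows up in the ticket's history alongside the masked event, so an auditor can see that something was withheld, when, and why.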
Section 07 · rotation

A new agent inherits all five tiers.

onboarding = config commit, not knowledge transfer
AGENT v7 · retired
AGENT v8 · new
01 WORKING · built fresh per call · no inheritance needed
02 HOT · shared substrate · new agent retrieves same chunks
03 SEMANTIC · firm's ontology · identical for every agent
04 PROCEDURAL · repo at v8 SHA · the only thing that actually changed
05 EPISODIC · history to train on, evaluate against, audit from
Working memory is the only ephemeral tier. The other four are the firm's, not the agent's.
Section 08 · dreaming · offline consolidation

The agent sleeps on it.

Anthropic Memory tool · 6× completion lift at Harvey · 97% first-pass-error drop at Wisedocs
AWAKE · live sessions
/memories
  project-acme/
    scratch-2026-05-25.md
    scratch-2026-05-24.md
    half-finished-plan.md
    customer-quirks-rough.md
    ...
raw notes · one file per turn

DREAM · nightly
DREAM JOB
1 · read raw notes
2 · read past transcripts
3 · cluster + dedupe
4 · rewrite cleaner
5 · diff for review
6 · commit
a second agent · same fs tools

AWAKE · next morning
/memories
  project-acme/
    CUSTOMER-QUIRKS.md
    ACTIVE-PLANS.md
    ARCHIVE.md
organized · deduped · indexed

SIX FILE OPS
view · create · str_replace · insert · delete · rename
same toolchain as code execution · no special memory API

EVERY WRITE = AUDIT EVENT
session-id · tool · diff · redact-flag · rollback-handle
memory is an auditable surface, not a black box
Awake writes are cheap and messy. The dream rewrites them cleaner. Every edit is logged - replay-able, redact-able, roll-back-able.
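To make "every write = audit event" concrete, here is a small in-memory sketch. It does not call any real memory API; the class, its methods, and the snapshot-index rollback handle are hypothetical, and only two of the six file operations are shown. The audit fields mirror the list on this page.

```python
import difflib


class AuditedMemory:
    """In-memory file store where each write appends an audit event."""

    def __init__(self):
        self.files = {}
        self.audit = []
        self._snapshots = []  # (path, previous content) per write

    def _record(self, session_id, tool, path, before, after, redact_flag=False):
        self._snapshots.append((path, before))
        diff = "".join(difflib.unified_diff(
            (before or "").splitlines(keepends=True),
            (after or "").splitlines(keepends=True),
            fromfile=path, tofile=path,
        ))
        event = {
            "session_id": session_id,
            "tool": tool,
            "diff": diff,
            "redact_flag": redact_flag,
            "rollback_handle": len(self._snapshots) - 1,
        }
        self.audit.append(event)
        return event

    def create(self, session_id, path, text):
        before = self.files.get(path)
        self.files[path] = text
        return self._record(session_id, "create", path, before, text)

    def str_replace(self, session_id, path, old, new):
        before = self.files[path]
        if before.count(old) != 1:
            raise ValueError("old text must match exactly once")
        after = before.replace(old, new)
        self.files[path] = after
        return self._record(session_id, "str_replace", path, before, after)

    def rollback(self, session_id, handle):
        """Restore the content from before a write; the rollback is logged too."""
        path, before = self._snapshots[handle]
        current = self.files.get(path)
        if before is None:
            self.files.pop(path, None)
        else:
            self.files[path] = before
        return self._record(session_id, "rollback", path, current, before)
```

Because the rollback is recorded like any other write, the audit trail stays complete even when an edit is undone.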
Section 09 · the shared notebook

You write once. Every teammate's agent reads.

Kindred · KAF 0.1 · closes OWASP AST07 (update drift) + AST09 (no governance)
agent A · agent B · agent C · agent D

KINDRED PAGE · signed (KAF 0.1)
title: "how we structure migrations"
author: "lina · sales-eng-DRI"
bless: 7 / 10 members · ✓
touched: 2026-05-22 (4 days ago)
expires: 2026-08-22 (90 days)
sig: ed25519 · X-Agent-Pubkey
retrieved with provenance · not regenerated

NETWORK HEALTH · 4
retrieval utility · success rate · top-1 precision · mean rank
time to first useful · p50 / p90 from join to first reported hit
trust propagation · p50 / p90 from publish to threshold blessing
staleness cost · shadow hits + soon-to-expire returns over 7 days

LIFECYCLE
write · bless · retrieve · report-wrong · stale @ 90d · expire
N agents re-deriving the team's standards = N answers. Kindred is the shared substrate - write once, retrieve with provenance, decay if untouched.
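The page names the retrieval-utility metrics but does not define them. The sketch below uses conventional definitions as an assumption: each query is summarized by the rank of its first useful hit, or `None` when nothing useful came back. The function name and input shape are hypothetical and are not taken from Kindred or KAF.

```python
def retrieval_utility(first_useful_ranks):
    """Compute success rate, top-1 precision, and mean rank.

    first_useful_ranks holds one entry per query: the 1-based rank of the
    first useful result, or None if no useful result was returned.
    """
    total = len(first_useful_ranks)
    if total == 0:
        raise ValueError("need at least one query")
    hits = [rank for rank in first_useful_ranks if rank is not None]
    return {
        # Share of queries that returned anything useful.
        "success_rate": len(hits) / total,
        # Share of queries whose first result was already useful.
        "top1_precision": sum(1 for rank in hits if rank == 1) / total,
        # Average position of the first useful result, over successful queries.
        "mean_rank": sum(hits) / len(hits) if hits else None,
    }


# Four hypothetical queries: two answered at rank 1, one at rank 3, one miss.
metrics = retrieval_utility([1, 3, None, 1])
```

For these four queries the success rate is 0.75 and top-1 precision is 0.5; mean rank is computed over the three successful queries only.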
Section 10 · tools 2026 · OSS-first

Where the OSS picks land.

temporal ↑↓ non-temporal  ·  OSS ←→ SaaS
quadrant axes: OSS / SaaS · temporal-aware / non-temporal
Graphiti · LongMemEval #1 (63.8%)
Letta · Apache 2.0 · absorbed MemGPT
Cognee · MIT · 6 LOC start
Mem0 · +29.6pts temporal · Apr 2026
Zep Cloud · skip · Graphiti OSS covers
memory-portability · vollko · flagship schema
Top-left quadrant is where the long-running agent lives. Pick by question shape.
Section 09b · the portable schema

Move memory between runtimes. Without loss.

memory-portability · one JSON schema · three primitives match this page 1:1
from: Mem0 · LangChain · LlamaIndex · OpenAI Memory · Hermes · custom adapters

PORTABLE BUNDLE · .mempack.json
episodic [ ... ]
semantic [ ... ]
procedural [ ... ]
+ provenance
+ optional embeddings
+ consent flags

to: Mem0 · LangChain · LlamaIndex · OpenAI Memory · your runtime

CLI: export · import · merge · forget
The model layer is liquid. Don't lock the user at the memory layer.
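A rough sketch of what operating on such a bundle might involve. The three top-level keys and the idea of provenance and consent flags come from the page; the item fields (`id`, `subject`, `text`, `consent`) and the function names are hypothetical and should not be read as the actual memory-portability schema or CLI.

```python
import json

TIERS = ("episodic", "semantic", "procedural")


def new_bundle():
    """Empty bundle with one list per tier plus a provenance list."""
    bundle = {tier: [] for tier in TIERS}
    bundle["provenance"] = []
    return bundle


def merge(left, right):
    """Combine two bundles, de-duplicating items by id (left side wins)."""
    merged = new_bundle()
    for tier in TIERS:
        seen = set()
        for item in left[tier] + right[tier]:
            if item["id"] not in seen:
                seen.add(item["id"])
                merged[tier].append(item)
    merged["provenance"] = left["provenance"] + right["provenance"]
    return merged


def forget(bundle, subject):
    """Return a copy with every item about the given subject removed."""
    cleaned = new_bundle()
    for tier in TIERS:
        cleaned[tier] = [i for i in bundle[tier] if i["subject"] != subject]
    cleaned["provenance"] = list(bundle["provenance"])
    return cleaned


def export(bundle):
    """Serialize to JSON text, e.g. for writing a .mempack.json file."""
    return json.dumps(bundle, indent=2, sort_keys=True)


a = new_bundle()
a["semantic"].append({"id": "m1", "subject": "acme", "text": "tier: gold", "consent": True})
b = new_bundle()
b["semantic"].append({"id": "m1", "subject": "acme", "text": "tier: gold", "consent": True})
b["episodic"].append({"id": "m2", "subject": "lina", "text": "asked about SLAs", "consent": True})
combined = merge(a, b)
after_forget = forget(combined, "lina")
```

Keeping `forget` as a first-class operation next to `export`, `import`, and `merge` is what lets consent changes travel with the memory rather than staying behind in one runtime.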
Section 11 · vollko OSS · this layer

The primitives.
