vollko
AI-native · cognition

Knowledge graphs, drawn.

The build process, the retrieval funnel, the temporal model - one diagram per concept. Everything below is the picture, not the prose.

[Diagram: sample graph. Nodes: CUSTOMER, CONTRACT, ACCOUNT, PERSON, AUDITOR, CLAUSE. Edges: signs, owns, held by, manages, audited by, reports to, contains.]
Section 01 · the difference

Vector store vs knowledge graph.

same query · two answer shapes
[Diagram: the query "contracts expiring soon" answered two ways.
VECTOR STORE: the top-5 closest chunks of text.
KNOWLEDGE GRAPH: CUSTOMER has MSA-031 (2026-07), SOW-104 (2026-09), and NDA-22 (2029-01); the filter "< Q3" leaves 2 structured CONTRACT nodes.]
A vector store finds documents about contracts. A graph returns the contracts themselves.
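The graph side of the diagram can be sketched in a few lines of Python. This is only an illustration: the `Contract` class, the `HAS` adjacency list, and `expiring_before` are hypothetical names, the diagram gives months only (the first of each month is assumed here), and the cutoff of 2026-10-01 is one reading of its "< Q3" filter.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Contract:
    id: str
    valid_to: date  # day of month is assumed; the diagram shows year-month only


# CUSTOMER -has-> CONTRACT edges, kept as a plain adjacency list (hypothetical store).
HAS: Dict[str, List[Contract]] = {
    "CUSTOMER": [
        Contract("MSA-031", date(2026, 7, 1)),
        Contract("SOW-104", date(2026, 9, 1)),
        Contract("NDA-22", date(2029, 1, 1)),
    ]
}


def expiring_before(customer: str, cutoff: date) -> List[Contract]:
    """Return the contract nodes themselves, filtered on a typed property."""
    return [c for c in HAS.get(customer, []) if c.valid_to < cutoff]


# Assumed cutoff: the day after Q3 2026 ends.
soon = expiring_before("CUSTOMER", date(2026, 10, 1))
print([c.id for c in soon])
```

The result is a list of typed records with a date field, which is the "answer shape" the diagram contrasts with a ranked list of text chunks.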
Section 02 · how a fact becomes a fact

Document → triple → canonical node.

two-pass extraction
[Diagram: SOURCE contract.pdf → PASS 1 · LLM open extraction ("give me (s, p, o, conf)") → FACTS EXTRACTED → PASS 2 · CANONICALIZE (resolve to ontology).
Facts extracted: "Atlas Logistics signed contract MSA-031"; "contract MSA-031 ends April 30, 2027"; "Atlas Ltd." (low confidence, same company?) is sent to review.
Pass 2 output: CUS -signs-> CON, plus a predicate map; conf < 0.7 → drop.]
Pass 1 says it. Pass 2 binds it to the firm's ontology. Anything that cannot bind gets parked.
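A minimal sketch of the second pass, assuming pass 1 has already produced (s, p, o, conf) tuples. The alias table, the predicate map, the node ids, and every confidence value below are invented for illustration; only the 0.7 threshold comes from the diagram. The diagram says low-confidence facts are dropped while the caption says unbindable facts are parked, so this sketch makes the assumption that both kinds go to a single review queue.

```python
from typing import List, NamedTuple, Optional, Tuple


class Fact(NamedTuple):
    subject: str
    predicate: str
    obj: str
    conf: float


# Hypothetical ontology bindings: surface form -> canonical node id / predicate.
ENTITY_ALIASES = {
    "atlas logistics": "CUS:atlas-logistics",
    "contract msa-031": "CON:MSA-031",
}
PREDICATE_MAP = {"signed": "signs", "ends": "valid_to"}
MIN_CONF = 0.7  # threshold shown in the diagram


def canonicalize(fact: Fact) -> Optional[Fact]:
    """Bind one extracted fact to the ontology, or return None if it cannot bind."""
    if fact.conf < MIN_CONF:
        return None
    subject = ENTITY_ALIASES.get(fact.subject.lower())
    predicate = PREDICATE_MAP.get(fact.predicate.lower())
    if subject is None or predicate is None:
        return None
    # Objects may be literals (dates, amounts), so unknown objects pass through.
    obj = ENTITY_ALIASES.get(fact.obj.lower(), fact.obj)
    return Fact(subject, predicate, obj, fact.conf)


def second_pass(facts: List[Fact]) -> Tuple[List[Fact], List[Fact]]:
    """Split pass-1 output into bound facts and facts parked for review."""
    bound, parked = [], []
    for fact in facts:
        result = canonicalize(fact)
        if result is None:
            parked.append(fact)
        else:
            bound.append(result)
    return bound, parked


extracted = [
    Fact("Atlas Logistics", "signed", "contract MSA-031", 0.98),
    Fact("contract MSA-031", "ends", "April 30, 2027", 0.95),
    Fact("Atlas Ltd.", "signed", "contract MSA-031", 0.55),  # low confidence
]
bound, parked = second_pass(extracted)
```

Keeping the parked facts rather than discarding them means a reviewer can later add "Atlas Ltd." to the alias table and replay the same extraction.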
Section 03 · where it lives

Three indexes, one knowledge API.

co-located storage, unified front
[Diagram: knowledge.lookup(entity, kind, intent), the unified knowledge API, sits in front of three indexes.
01 · GRAPH: typed nodes + edges (Neo4j · Memgraph · KuzuDB)
02 · VECTOR: embedded chunks (pgvector · Qdrant · LanceDB)
03 · KEYWORD: BM25 / inverted index (Tantivy · OpenSearch · tsvector)
Sample inverted index: contract → d12,d34,d88 · MSA-031 → d12 · invoice → d04,d77 · amend → d34,d88 · renewal → d12,d77 · SOW-104 → d22 · audit → d04,d88]
Each index answers a different question shape. The API picks; the agent never knows which fired.
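One way to picture "the API picks" is a small dispatcher behind the `lookup(entity, kind, intent)` signature shown in the diagram. Everything else here is assumed: the intent names (`relationship`, `similar`, `exact`), the routing table, and the stub backends are hypothetical, and the real routing logic is not described on this page. The keyword postings are copied from the diagram.

```python
from typing import Callable, Dict, List

# Inverted-index postings as drawn in the diagram.
KEYWORD_INDEX: Dict[str, List[str]] = {
    "contract": ["d12", "d34", "d88"],
    "MSA-031": ["d12"],
    "invoice": ["d04", "d77"],
    "amend": ["d34", "d88"],
    "renewal": ["d12", "d77"],
    "SOW-104": ["d22"],
    "audit": ["d04", "d88"],
}


def graph_lookup(entity: str, kind: str) -> List[str]:
    """Stub for a typed node/edge query (a real one would hit a graph store)."""
    return [f"graph:{kind}:{entity}"]


def vector_lookup(entity: str, kind: str) -> List[str]:
    """Stub for a nearest-neighbour search over embedded chunks."""
    return [f"vector:{entity}"]


def keyword_lookup(entity: str, kind: str) -> List[str]:
    """Exact-term lookup against the inverted index."""
    return KEYWORD_INDEX.get(entity, [])


# Hypothetical intent -> backend routing table.
ROUTES: Dict[str, Callable[[str, str], List[str]]] = {
    "relationship": graph_lookup,
    "similar": vector_lookup,
    "exact": keyword_lookup,
}


def lookup(entity: str, kind: str, intent: str) -> List[str]:
    """Single entry point; the caller never names an index."""
    backend = ROUTES.get(intent, vector_lookup)  # assumed default
    return backend(entity, kind)


print(lookup("MSA-031", "contract", "exact"))
```

The caller passes only an intent, so a backend can be swapped or a route changed without touching agent code.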
Section 04 · the retrieval funnel

From 400 candidates to one answer.

three stages · ~150ms p50
[Diagram: the three-stage funnel.
01 · CAST A WIDE NET: fetch 400 possibly-relevant snippets in parallel (~20ms); keep the 100 best.
02 · RERANK FOR RELEVANCE: a second model reads each one carefully (~60ms); keep the 10 best.
03 · FOLLOW CONNECTIONS (if needed): walk the graph to pick up related facts, only if the question crosses multiple things (~100ms).
Output: 10 chunks + 6 graph facts → agent context.]
Skip a stage, lose 5-15 points NDCG. This is the 2026 consensus stack.
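The funnel's shape can be sketched with toy scorers standing in for the real models. This is a sketch under stated assumptions, not the production stack: `overlap` replaces the first-stage retrievers, `density` replaces the reranking model, and the `graph` dictionary keyed by chunk text is a hypothetical stand-in for a graph walk. Only the stage order and the 100 / 10 cut-offs come from the diagram.

```python
from typing import Dict, List, Tuple


def overlap(query: str, chunk: str) -> float:
    """Cheap stage-1 score: count of shared lowercase words."""
    return float(len(set(query.lower().split()) & set(chunk.lower().split())))


def density(query: str, chunk: str) -> float:
    """Toy stand-in for a reranker: shared words per word of chunk."""
    words = chunk.split()
    return overlap(query, chunk) / len(words) if words else 0.0


def funnel(
    query: str,
    candidates: List[str],
    graph: Dict[str, List[str]],
    needs_graph: bool,
    keep_wide: int = 100,
    keep_final: int = 10,
) -> Tuple[List[str], List[str]]:
    # Stage 1: wide net, keep the best `keep_wide` by the cheap score.
    stage1 = sorted(candidates, key=lambda c: overlap(query, c), reverse=True)[:keep_wide]
    # Stage 2: rerank the survivors with the more careful score.
    stage2 = sorted(stage1, key=lambda c: density(query, c), reverse=True)[:keep_final]
    # Stage 3: follow connections only when the question needs it.
    facts: List[str] = []
    if needs_graph:
        for chunk in stage2:
            facts.extend(graph.get(chunk, []))
    return stage2, facts


CHUNKS = [
    "contract MSA-031 renewal terms",
    "invoice for March",
    "contract amendment draft",
    "team offsite notes",
]
GRAPH = {"contract MSA-031 renewal terms": ["MSA-031 valid_to 2028-04"]}

chunks, facts = funnel("contract renewal", CHUNKS, GRAPH, needs_graph=True,
                       keep_wide=3, keep_final=2)
```

With four candidates the cut-offs are shrunk to 3 and 2 so each stage visibly discards something; the defaults match the diagram's 100 and 10.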
Section 05 · bitemporal facts

Every fact has a validity window.

"what did we believe on March 1?"
[Diagram: timeline JAN to JUN with an as-of marker at MAR 1.
tier = Gold (active), followed by tier = Platinum (active after MAR 1).
MSA-031.valid_to = 2027-04, followed by MSA-031.valid_to = 2028-04 (active after renewal).
account.owner = Alice, followed by account.owner = Bob.
contact.email = a@x.com.
Legend: valid · supersedes earlier · superseded.]
The graph never forgets. Move the as-of marker; the answers move with it.
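A minimal sketch of the as-of lookup, assuming facts are stored as append-only versions that are closed off rather than overwritten. The `FactVersion` class, the key names, and all dates are illustrative (the diagram shows months without a year). For brevity this tracks only the "when did we believe it" axis; a full bitemporal model would carry a second, separate interval for real-world validity.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class FactVersion:
    key: str
    value: str
    recorded_from: date
    recorded_to: Optional[date] = None  # None means "still the current belief"


# Append-only log: a superseded version keeps its row and gains an end date.
LOG: List[FactVersion] = [
    FactVersion("customer.tier", "Gold", date(2026, 1, 10), date(2026, 3, 15)),
    FactVersion("customer.tier", "Platinum", date(2026, 3, 15)),
    FactVersion("MSA-031.valid_to", "2027-04", date(2026, 1, 10), date(2026, 4, 2)),
    FactVersion("MSA-031.valid_to", "2028-04", date(2026, 4, 2)),
]


def as_of(key: str, when: date) -> Optional[str]:
    """Return the value that was believed for `key` on the date `when`."""
    for version in LOG:
        started = version.recorded_from <= when
        not_ended = version.recorded_to is None or when < version.recorded_to
        if version.key == key and started and not_ended:
            return version.value
    return None


print(as_of("customer.tier", date(2026, 3, 1)))
print(as_of("customer.tier", date(2026, 6, 1)))
```

Moving the `when` argument plays the role of the as-of marker in the diagram: the same key resolves to a different version without any row being rewritten.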
Section 06 · the schema, drawn

Small ontology, edge properties.

a working firm-scale ontology · 20-60 entity types is plenty
[Diagram: the ontology.
Nodes: CUSTOMER (id, name, tier), CONTRACT (id, valid_to), ACCOUNT (id, owner), PERSON (id, email), CLAUSE (id, kind), INVOICE (id, amount), TEAM (id, name), INCIDENT (id, sev).
Edges: signs, held by, references, contains, paid by, manages, flagged.
Edge zoom-in: CUS -signs-> CON with at 2026-03-15, by u_alice, source doc_88723, conf 0.98, valid_to 2028-04-30.
Legend: core (always present) · domain (per-firm) · required relationship.]
Properties live on edges, not just nodes. That's why a property graph beats RDF for agents.
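The edge zoom-in maps naturally onto a record type. In this sketch the `Edge` class and the `live_edges` helper are hypothetical; the field names and values are taken from the diagram, and the 0.7 default is borrowed from the extraction threshold in Section 02 as an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Edge:
    src: str
    predicate: str
    dst: str
    at: date        # when the relationship was established
    by: str         # who asserted it
    source: str     # provenance document
    conf: float     # extraction confidence
    valid_to: date  # end of the validity window


signs = Edge(
    src="CUS",
    predicate="signs",
    dst="CON",
    at=date(2026, 3, 15),
    by="u_alice",
    source="doc_88723",
    conf=0.98,
    valid_to=date(2028, 4, 30),
)


def live_edges(edges: List[Edge], today: date, min_conf: float = 0.7) -> List[Edge]:
    """Keep edges that are confident enough and still inside their validity window."""
    return [e for e in edges if e.conf >= min_conf and e.valid_to >= today]


print(live_edges([signs], date(2026, 6, 1)))
```

Because provenance, confidence, and validity sit on the relationship itself, an agent can filter or cite a single edge without consulting a separate statement-about-a-statement structure.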
Section 07 · tools 2026

Where the OSS picks land.

temporal ↑↓ static  ·  schema-free ←→ schema-strict
[Diagram: 2×2 map. Vertical axis: temporal (top) to static (bottom). Horizontal axis: schema-free to schema-strict. Quadrants: free · temporal, strict · temporal, free · static, strict · static.
Tools plotted: Graphiti (LongMemEval #1); Cognee (MIT · in < 10 LOC); KAG (+19.6% F1 HotpotQA); LightRAG (~10k token prompts); HippoRAG 2 (PPR walks); Neo4j Agentic (auto-schema); MS GraphRAG (skip · 100k+ token prompts); memory-portability (vollko · portable schema); MoltSchool / Kindred (vollko · signed network).]
Pick by question shape. If you ask "as of when?", you need the upper half.
Section 08 · ways to ship a bad KG

Five anti-patterns.

SCHEMA-FIRST · design before data
STATIC DUMP · extract once, rot
NO TEMPORAL · overwrites history
VECTOR-ONLY · "my KG" that is not actually a graph
LLM-RESOLVES · conflict at runtime (A or B? The LLM picks at retrieval time)
Section 09 · vollko OSS · this layer

The primitives.
