We set out to build an agent memory system that beats zer0dex on cross-domain recall while adding what it can't do: temporal reasoning, full provenance, and auto-generated indexes.
This is the technical story of how we got there.
## The Insight: Everything Is an Event
The key architectural decision: vectors, graph nodes, graph edges, and domain metadata are all events in the same WAL.
```
┌─────────────────────────────────────────────┐
│               AllSource Prime               │
│                                             │
│  prime.node.created   → Graph projection    │
│  prime.edge.created   → Adjacency projection│
│  prime.vector.stored  → HNSW projection     │
│  recall.index.updated → Compressed index    │
│                                             │
│  ┌──────────────────────────────────────┐   │
│  │       WAL + Parquet + DashMap        │   │
│  │       Single durability layer        │   │
│  └──────────────────────────────────────┘   │
└─────────────────────────────────────────────┘
```
This means:
- Time-travel is free. Query any entity's state at any past timestamp — it's just "replay events up to timestamp T."
- Provenance is free. Every mutation is an immutable event with a timestamp and optional metadata.
- Consistency is free. No distributed transaction across vector DB + graph DB + event store. One WAL, one fsync.
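The "replay events up to timestamp T" claim can be sketched in a few lines. The `Event` shape and `state_at` helper below are illustrative stand-ins, not Prime's actual types:

```rust
use std::collections::HashMap;

// Illustrative event record: (timestamp, entity, property, value).
pub struct Event {
    pub ts: u64,
    pub entity: String,
    pub key: String,
    pub value: String,
}

// Time-travel query: fold every event for the entity with ts <= t into a
// property map. Later events overwrite earlier ones, so this assumes the
// log is in timestamp order — which a WAL is by construction.
pub fn state_at(log: &[Event], entity: &str, t: u64) -> HashMap<String, String> {
    let mut state = HashMap::new();
    for e in log.iter().filter(|e| e.entity == entity && e.ts <= t) {
        state.insert(e.key.clone(), e.value.clone());
    }
    state
}
```

Querying at a timestamp between two writes returns the older value; no storage beyond the WAL itself is needed to support this.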
## The Projection Pattern
Projections are materialized views computed incrementally from events:
```rust
pub trait Projection: Send + Sync {
    fn name(&self) -> &str;
    fn process(&self, event: &Event) -> Result<()>;
    fn get_state(&self, entity_id: &str) -> Option<Value>;
    fn clear(&self);
    fn snapshot(&self) -> Option<Value> { None }
    fn restore(&self, _snapshot: &Value) -> Result<()> { Ok(()) }
}
```

Every projection gets the same event stream. Each maintains a different index:
| Projection | What it indexes | Query cost |
|---|---|---|
| NodeState | node properties by entity_id | O(1) DashMap lookup |
| AdjacencyList | outgoing edges per node | O(1) DashMap lookup |
| VectorIndex | HNSW over embeddings | O(log n) approximate NN |
| DomainIndex | nodes grouped by domain | O(1) DashMap lookup |
| CrossDomain | edges spanning domain boundaries | O(1) DashMap lookup |
| CompressedIndex | auto-generated markdown TOC | O(1) cached |
Adding a new index means implementing Projection — the event store replays existing events through it on registration.
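A sketch of that flow, with std types standing in for DashMap and `serde_json::Value`, the trait narrowed to concrete types, and all names (`DomainCount`, `Store`) invented for illustration:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

pub struct Event { pub kind: String, pub domain: String }

// Simplified from the real trait: no snapshot/restore, concrete state type.
pub trait Projection: Send + Sync {
    fn name(&self) -> &str;
    fn process(&self, event: &Event);
    fn get_state(&self, entity_id: &str) -> Option<u64>;
}

// A toy index: node count per domain. Mutex<HashMap> stands in for DashMap.
pub struct DomainCount { counts: Mutex<HashMap<String, u64>> }

impl DomainCount {
    pub fn new() -> Self { DomainCount { counts: Mutex::new(HashMap::new()) } }
}

impl Projection for DomainCount {
    fn name(&self) -> &str { "domain_count" }
    fn process(&self, event: &Event) {
        if event.kind == "prime.node.created" {
            *self.counts.lock().unwrap().entry(event.domain.clone()).or_insert(0) += 1;
        }
    }
    fn get_state(&self, domain: &str) -> Option<u64> {
        self.counts.lock().unwrap().get(domain).copied()
    }
}

pub struct Store { pub log: Vec<Event> }

impl Store {
    // Registration replays history, so a late-added index catches up to the
    // current state without any bespoke backfill code.
    pub fn register(&self, p: &dyn Projection) {
        for e in &self.log { p.process(e); }
    }
}
```

The projection only ever sees events, so the same `register` path serves both fresh indexes and crash recovery.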
## The Compressed Index
zer0dex's big insight: vector similarity finds X or Y when you ask "how does X relate to Y?" — rarely both. A structured markdown index bridges this gap.
Our version auto-generates from projections:
```rust
pub fn build_heuristic_index(summary: &IndexRawSummary) -> String {
    // Organized by domain, sorted by node count
    // Cross-references section shows domain pairs + relation types
    // Token budget: 500-2000 adaptive
}
```

The index updates as events flow in. No manual maintenance.
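A minimal version of the heuristic generation, assuming the summary is reduced to (domain, node count) pairs — the real `IndexRawSummary` carries more than this (cross-references, relation types):

```rust
// Hypothetical reduced summary: (domain, node_count) pairs.
pub fn build_heuristic_index(domains: &[(&str, usize)]) -> String {
    let mut sorted: Vec<_> = domains.to_vec();
    sorted.sort_by(|a, b| b.1.cmp(&a.1)); // largest domains first

    let mut out = String::from("# Memory Index\n\n## Domains\n");
    for (domain, count) in sorted {
        out.push_str(&format!("- **{}** ({} nodes)\n", domain, count));
    }
    out
}
```

Because the input comes straight from projection state, regenerating the markdown after each event batch keeps the index current for free.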
## SOLID Refactoring
We started with a `CompressedIndexProjection` that maintained its own domain maps — duplicating what `DomainIndexProjection` and `CrossDomainProjection` already tracked. Three projections independently maintaining the same `node_id → domain` HashMap.
The refactor:
- Extracted `LlmBackend` trait — `OllamaBackend` implements it; adding OpenAI is a new struct, not a modification
- Removed duplicated state — `build_raw_summary()` reads from existing projections, doesn't maintain its own copies
- DI constructor — `RecallEngine::with_dependencies()` for testing with mock backends
- `reqwest::Client` reuse — stored as a field, created once instead of per-request
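The backend seam might look like this sketch. `LlmBackend`, `OllamaBackend`, and `RecallEngine::with_dependencies()` are names from the refactor above; the method signatures and `MockBackend` are assumptions:

```rust
// Trait extracted so new backends are additions, not modifications (open/closed).
pub trait LlmBackend: Send + Sync {
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

// The kind of test double that with_dependencies() exists to inject.
pub struct MockBackend;

impl LlmBackend for MockBackend {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("summary of: {prompt}"))
    }
}

pub struct RecallEngine { backend: Box<dyn LlmBackend> }

impl RecallEngine {
    // DI constructor: tests pass MockBackend; production would pass an
    // OllamaBackend holding one shared reqwest::Client as a field.
    pub fn with_dependencies(backend: Box<dyn LlmBackend>) -> Self {
        RecallEngine { backend }
    }

    pub fn summarize_index(&self, scaffold: &str) -> Result<String, String> {
        self.backend.complete(scaffold)
    }
}
```

Swapping in an OpenAI backend is then one new struct implementing `LlmBackend`; `RecallEngine` never changes.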
## Why Rust
Three reasons, all pragmatic:
- **Single binary deployment.** `cargo install allsource-prime` — no runtime, no Docker, no Python environment. The MCP server is a 30MB static binary.
- **DashMap for concurrent projections.** Multiple projections process the same event stream concurrently. DashMap's sharded lock-free reads give us 12μs query latency without unsafe code.
- **The `Projection` trait composes.** Rust's trait system lets us mix projections freely — add a new index by implementing a trait, register it, and the store replays events through it. No reflection, no runtime cost.
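The "same event stream, concurrent projections" shape is easy to illustrate with plain threads. This sketch is std-only (no DashMap), returns counts instead of maintaining shared maps, and uses invented names:

```rust
use std::sync::Arc;
use std::thread;

// Two "projections", each consuming the same shared event stream from its
// own thread: one counts node events, the other edge events.
pub fn run_projections(events: Vec<String>) -> (usize, usize) {
    let events = Arc::new(events);

    let ev = Arc::clone(&events);
    let nodes = thread::spawn(move || {
        ev.iter().filter(|e| e.as_str() == "prime.node.created").count()
    });

    let ev = Arc::clone(&events);
    let edges = thread::spawn(move || {
        ev.iter().filter(|e| e.as_str() == "prime.edge.created").count()
    });

    (nodes.join().unwrap(), edges.join().unwrap())
}
```

The real store keeps projection state in DashMap so readers don't block each other; here the shared data is immutable, which sidesteps locking entirely.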
## What's Next
The compressed index is the foundation. Next:
- LLM-assisted summarization — use a local model to generate natural-language index from the heuristic scaffold
- Benchmark publication — LoCoMo and LongMemEval scores with real embeddings
- WASM target — run Prime entirely in the browser for interactive demos
The code is at github.com/all-source-os/all-source. The examples are in apps/core/examples/.

