AllSource Core is a purpose-built event store written in Rust. It achieves 469K events/sec ingestion and 11.9us p99 queries through three storage layers: a Write-Ahead Log for crash recovery, Parquet files for long-term persistence, and DashMap for in-memory reads. This post explains how each layer works and why we chose this architecture over PostgreSQL.
The three-layer storage model
Write path: Event → WAL (fsync) → DashMap (memory) → Parquet (periodic flush)
Read path: Query → DashMap (11.9us) → done
Recovery: Startup → Parquet (bulk load) → WAL (replay delta) → DashMap (ready)
Every event touches all three layers. The WAL ensures durability (fsync to disk), DashMap enables fast reads (sharded concurrent hash map), and Parquet provides long-term columnar storage (Snappy compression, analytical queries).
This is not "in-memory only." This is not "data lost on restart." Events survive process crashes, machine reboots, and disk failures (when backed by replicated volumes like Fly's persistent storage).
Layer 1: Write-Ahead Log (WAL)
The WAL is the durability guarantee. When an event arrives:
- Serialize the event to bytes
- Compute a CRC32 checksum over the bytes
- Write `[length][checksum][bytes]` to the WAL file
- `fsync()` to disk (configurable interval, default 100ms)
- Only then return 200 to the caller
The CRC32 checksum detects corruption — if a WAL entry's checksum doesn't match its payload during recovery, we skip the corrupted entry and log a warning. This catches bit-rot, partial writes from crashes, and filesystem corruption.
```rust
// Simplified WAL write (actual code in infrastructure/persistence/wal.rs)
pub fn append(&mut self, event: &Event) -> Result<()> {
    let bytes = bincode::serialize(event)?;
    let checksum = crc32fast::hash(&bytes);
    let len = bytes.len() as u32;
    self.file.write_all(&len.to_le_bytes())?;
    self.file.write_all(&checksum.to_le_bytes())?;
    self.file.write_all(&bytes)?;
    // fsync on interval, not every write — 100ms default
    if self.should_sync() {
        self.file.sync_data()?;
    }
    Ok(())
}
```

The fsync interval is a durability/performance trade-off:
- 100ms (default): at most 100ms of events lost on power failure. Good enough for most use cases.
- 0ms (every write): zero data loss, but throughput drops ~10x.
- 1s: higher throughput, up to 1s of potential loss.
For financial use cases or audit trails, set ALLSOURCE_FSYNC_INTERVAL=0. For IoT telemetry where occasional loss is acceptable, 1s is fine.
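As a sketch, the interval check called from the append path above might look like this — assuming ALLSOURCE_FSYNC_INTERVAL is expressed in milliseconds; the struct and field names are illustrative, not the actual wal.rs internals:

```rust
use std::time::{Duration, Instant};

// Illustrative sketch only — the real WAL keeps its own config and state.
struct SyncPolicy {
    interval: Duration, // from ALLSOURCE_FSYNC_INTERVAL (assumed milliseconds)
    last_sync: Instant,
}

impl SyncPolicy {
    fn from_env() -> Self {
        let ms: u64 = std::env::var("ALLSOURCE_FSYNC_INTERVAL")
            .ok()
            .and_then(|v| v.parse().ok())
            .unwrap_or(100); // default: 100ms
        SyncPolicy {
            interval: Duration::from_millis(ms),
            last_sync: Instant::now(),
        }
    }

    // interval == 0 means fsync on every write (zero data loss)
    fn should_sync(&mut self) -> bool {
        if self.interval.is_zero() || self.last_sync.elapsed() >= self.interval {
            self.last_sync = Instant::now();
            return true;
        }
        false
    }
}
```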
Layer 2: DashMap (in-memory concurrent reads)
After the WAL write, the event is inserted into a DashMap — a sharded concurrent hash map from the dashmap crate. This is where reads come from.
DashMap uses sharded RwLocks internally: operations on different shards never contend, and multiple readers can hold the same shard concurrently. This is why AllSource achieves 11.9us p99 query latency — reads are pure memory lookups, not disk I/O.
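As a minimal sketch (with stand-in types — the real Event and store definitions live in store.rs), the write-path insert is a single sharded-map operation:

```rust
use dashmap::DashMap;
use uuid::Uuid;

// Stand-in types for illustration; the real definitions live in store.rs.
#[derive(Clone)]
struct Event {
    id: Uuid,
    payload: Vec<u8>,
}

struct Store {
    events: DashMap<Uuid, Event>,
}

impl Store {
    // Called only after the WAL append succeeds: the event is already
    // durable, so this step just makes it visible to readers.
    fn insert(&self, event: Event) {
        self.events.insert(event.id, event);
    }
}
```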
```rust
// Simplified query (actual code in store.rs)
pub fn query(&self, filter: &QueryFilter) -> Vec<Event> {
    self.events
        .iter()
        .filter(|e| filter.matches(e))
        .take(filter.limit)
        .map(|e| e.clone())
        .collect()
}
```

The trade-off: all events must fit in memory. With 164K events on a 1GB VM (our current production deployment), each event averages ~6KB including metadata. For a million events, you'd need ~6GB of RAM. This is why AllSource's pricing tiers are event-count-based — the cost of the service is proportional to the memory required.
Layer 3: Parquet (columnar persistence)
Periodically (default: every 4 hours or 10K events), the event store flushes a Parquet checkpoint (sketched below):
- Snapshot all events in the current window
- Write them to a Parquet file with Snappy compression
- Truncate the WAL (events are now safely in Parquet)
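Here is that flush as a minimal sketch using the arrow and parquet crates; the two-column schema, file name, and in-memory snapshot format are illustrative assumptions, not the actual checkpoint layout:

```rust
use std::fs::File;
use std::sync::Arc;
use arrow::array::{ArrayRef, BinaryArray, StringArray};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use parquet::arrow::ArrowWriter;
use parquet::basic::Compression;
use parquet::file::properties::WriterProperties;

// Write a snapshot of events to a Snappy-compressed Parquet checkpoint.
fn flush_checkpoint(
    event_types: Vec<String>,
    payloads: Vec<Vec<u8>>,
) -> Result<(), Box<dyn std::error::Error>> {
    let schema = Arc::new(Schema::new(vec![
        Field::new("event_type", DataType::Utf8, false),
        Field::new("payload", DataType::Binary, false),
    ]));
    let batch = RecordBatch::try_new(
        schema.clone(),
        vec![
            Arc::new(StringArray::from(event_types)) as ArrayRef,
            Arc::new(BinaryArray::from_iter_values(
                payloads.iter().map(|p| p.as_slice()),
            )),
        ],
    )?;

    let props = WriterProperties::builder()
        .set_compression(Compression::SNAPPY)
        .build();
    let file = File::create("checkpoint.parquet")?;
    let mut writer = ArrowWriter::try_new(file, schema, Some(props))?;
    writer.write(&batch)?;
    // close() flushes the footer — only after this is the WAL safe to truncate
    writer.close()?;
    Ok(())
}
```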
Parquet files are columnar — meaning queries that filter on event_type or tenant_id only read those columns, not the entire event payload. This makes analytical queries fast even on large datasets.
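To make the columnar point concrete, a sketch of a projected read — it decodes only the first two root columns (assumed here to be event_type and tenant_id) and never touches the payload bytes:

```rust
use std::fs::File;
use parquet::arrow::arrow_reader::ParquetRecordBatchReaderBuilder;
use parquet::arrow::ProjectionMask;

fn read_columns() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open("checkpoint.parquet")?;
    let builder = ParquetRecordBatchReaderBuilder::try_new(file)?;

    // Decode only the first two root columns; the (much larger) payload
    // column is never read from disk.
    let mask = ProjectionMask::roots(builder.parquet_schema(), [0, 1]);
    let reader = builder.with_projection(mask).build()?;

    for batch in reader {
        let batch = batch?;
        println!("{} rows, {} columns", batch.num_rows(), batch.num_columns());
    }
    Ok(())
}
```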
At startup, Core loads events from Parquet first (bulk load), then replays any WAL entries that were written after the last checkpoint. This gives you fast recovery: the Parquet load is a single sequential read, and the WAL replay is typically small (only events since the last checkpoint).
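A minimal sketch of the WAL-replay half of that recovery, matching the [length][checksum][bytes] layout above (the real logic lives in infrastructure/persistence/wal.rs and may differ in detail):

```rust
use std::fs::File;
use std::io::{BufReader, Read};

fn replay(path: &str) -> std::io::Result<Vec<Vec<u8>>> {
    let mut reader = BufReader::new(File::open(path)?);
    let mut events = Vec::new();

    loop {
        // Read the fixed 8-byte header: [length: u32][checksum: u32]
        let mut header = [0u8; 8];
        match reader.read_exact(&mut header) {
            Ok(()) => {}
            // Clean end of file: replay is complete
            Err(e) if e.kind() == std::io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let len = u32::from_le_bytes(header[0..4].try_into().unwrap()) as usize;
        let stored = u32::from_le_bytes(header[4..8].try_into().unwrap());

        let mut bytes = vec![0u8; len];
        if reader.read_exact(&mut bytes).is_err() {
            // Partial write from a crash mid-entry: stop replay here
            break;
        }
        // Skip entries whose payload no longer matches its checksum
        if crc32fast::hash(&bytes) != stored {
            eprintln!("warning: skipping corrupted WAL entry");
            continue;
        }
        // In the real code: bincode-deserialize and insert into the DashMap
        events.push(bytes);
    }
    Ok(events)
}
```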
Why not PostgreSQL?
We get this question a lot. Here's the concrete comparison:
| Metric | AllSource Core | PostgreSQL |
|---|---|---|
| Ingestion throughput | 469K events/sec | ~10K inserts/sec (with indexes) |
| Query latency (p99) | 11.9us | 500us-5ms (depending on indexes) |
| Storage per event | ~1KB (Snappy compressed Parquet) | ~2-4KB (row storage + indexes + MVCC overhead) |
| Crash recovery | WAL replay + Parquet load (~2s for 164K events) | WAL replay (~10-60s depending on checkpoint frequency) |
| Concurrent reads | Lock-free (DashMap shards) | MVCC (snapshot isolation, can block on heavy writes) |
PostgreSQL is a general-purpose database designed for mutable state. AllSource Core is a purpose-built event store designed for append-only, immutable events. The append-only constraint lets us make optimizations that PostgreSQL can't:
- No MVCC overhead: events are immutable, so we never update or delete
- No index maintenance: DashMap IS the index — O(1) lookups, no B-tree rebalancing
- No vacuum: no dead tuples, no bloat, no autovacuum pauses
- No write amplification on the hot path: each event hits disk once at ingest (the WAL), with Parquet checkpoints batched and amortized — PostgreSQL writes each row at least twice (WAL + heap page)
The durability guarantee
When POST /api/v1/events returns 200, your event is:
1. Written to the WAL with CRC32 checksum
2. In the DashMap (queryable immediately)
3. Pending Parquet flush (will be checkpointed within the configured interval)
If the process crashes after step 1, the event survives — WAL replay recovers it on the next startup. If the disk fails, the event is lost only if both the WAL and the Parquet checkpoint are on the same disk (use replicated volumes for production).
We run a durability test that writes events, kills the process, restarts, and verifies all events survived. It passes. Every time.
What this means for your agent
If you're using AllSource as memory for an AI agent:
- Every observation, decision, and action is durable. The agent's history survives restarts, deploys, and crashes.
- Time-travel is instant. Querying "what did the agent know at 3pm yesterday?" is an 11.9us DashMap lookup, not a log grep.
- The audit trail is automatic. Every event has a timestamp, checksum, and provenance — you get SOC2-grade audit trails as a side effect of the storage model.
This is why we say AllSource Core IS the database. It's not a cache in front of PostgreSQL. It's not an in-memory store that loses data on restart. It's a purpose-built, durable, high-performance event store.
Start at all-source.xyz or read the API docs.

