engineering · May 29, 2026 · 3 min read

The write-ahead log: how the cache survives a kill -9

Durability isn't a checkbox — it's a sequence of writes in the right order with checksums at every step. Here's the boring machinery that makes restarts uneventful.

Caches traditionally shrug at durability — 'it's just a cache' — but an LLM cache's contents cost real money to rebuild. Losing a warm corpus to a pod reschedule means re-purchasing it from your provider at full price. So CrowkisDB treats every entry as worth keeping: writes land in the write-ahead log before anything else, each record framed with a CRC32 checksum, fsynced according to policy.

In plain words: Everything is written to a crash-proof journal first. Pull the plug mid-write and Crowkis replays the journal on restart: every write it had acknowledged comes back, and the half-written one is discarded.
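To make the framing concrete, here is a minimal sketch of a checksummed append. It is not CrowkisDB's actual on-disk format: the frame layout (4-byte length, 4-byte CRC32, payload), the names `encode_record` and `append_record`, and the boolean `sync` flag standing in for the fsync policy are all assumptions made for illustration.

```python
import os
import struct
import zlib

# Hypothetical frame layout (an assumption, not CrowkisDB's real format):
# 4-byte little-endian payload length, 4-byte CRC32 of the payload, payload.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Frame a payload with its length and CRC32 checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def append_record(log, payload: bytes, sync: bool = True) -> None:
    """Append one framed record to an open binary log file.

    `sync` is a stand-in for the fsync policy: True forces the record
    to stable storage before returning, False leaves it to the OS.
    """
    log.write(encode_record(payload))
    log.flush()  # push Python's userspace buffer down to the kernel
    if sync:
        os.fsync(log.fileno())  # ask the kernel to flush to the device
```

A caller would open the log with `open(path, "ab")` and acknowledge a write only after `append_record` returns. With `sync=False` the record survives a process kill (the kernel still holds it) but not necessarily a power cut, which is the trade-off an fsync policy exists to express.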

Recovery is the payoff: on startup the engine replays the log from the last checkpoint, validating every checksum and rebuilding the memtable to its pre-crash state. A torn write at the tail, the classic power-cut artifact, fails its checksum and is truncated cleanly instead of poisoning the replay. The vector index recovers alongside, so semantic search resumes where it stopped.
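The replay-and-truncate step can be sketched in a few lines. This assumes a hypothetical frame of 4-byte length, 4-byte CRC32, then payload, and a function name `replay` invented here. It also simplifies in two ways: it reads the whole log from the start rather than from a checkpoint, and it treats the first bad record as the end of the log, whereas a real engine may distinguish a torn tail from mid-log corruption.

```python
import os
import struct
import zlib

# Assumed frame: 4-byte little-endian length, 4-byte CRC32, then payload.
HEADER = struct.Struct("<II")


def replay(path: str) -> list[bytes]:
    """Return every valid payload and truncate the log after the last one."""
    records = []
    good_end = 0  # byte offset just past the last record that verified
    with open(path, "r+b") as log:
        data = log.read()
        offset = 0
        while offset + HEADER.size <= len(data):
            length, crc = HEADER.unpack_from(data, offset)
            start = offset + HEADER.size
            payload = data[start:start + length]
            # A short payload or a checksum mismatch marks a torn record.
            if len(payload) < length or zlib.crc32(payload) != crc:
                break
            records.append(payload)
            offset = start + length
            good_end = offset
        if good_end < len(data):
            # Drop the torn tail so later appends start on a clean boundary.
            log.truncate(good_end)
            log.flush()
            os.fsync(log.fileno())
    return records
```

Running `replay` twice on the same file returns the same records, because the first pass has already removed anything that failed validation.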

Figure: the write-trust pipeline. Five stages score every write before it can ever be served.

We don't trust this machinery; we attack it. The integration suite kills the process mid-write and verifies every acknowledged entry survives; the Docker smoke test does the same to the whole container. Durability claims that aren't tested by murder are marketing.
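A kill-mid-write test of this kind can be sketched as follows. This is not the CrowkisDB integration suite: the writer script, the frame layout (4-byte length, 4-byte CRC32, payload), and the names `crash_test` and `read_valid` are assumptions for illustration. A child process appends records and prints an acknowledgement only after each fsync; the parent kills it without warning and checks that every acknowledged entry is still readable.

```python
import os
import struct
import subprocess
import sys
import tempfile
import zlib

HEADER = struct.Struct("<II")  # assumed frame: length, CRC32, then payload

# Hypothetical writer: appends forever, acknowledging only after fsync.
WRITER = r"""
import os, struct, sys, zlib
header = struct.Struct("<II")
with open(sys.argv[1], "ab") as log:
    i = 0
    while True:
        payload = b"entry-%d" % i
        log.write(header.pack(len(payload), zlib.crc32(payload)) + payload)
        log.flush()
        os.fsync(log.fileno())
        print(i, flush=True)  # the acknowledgement
        i += 1
"""


def read_valid(path):
    """Return payloads up to the first record that fails validation."""
    with open(path, "rb") as f:
        data = f.read()
    payloads, offset = [], 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break
        payloads.append(payload)
        offset = start + length
    return payloads


def crash_test(acks_before_kill: int = 50) -> tuple[int, int]:
    """Kill the writer mid-stream; return (acknowledged, recovered) counts."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal.log")
        child = subprocess.Popen(
            [sys.executable, "-c", WRITER, path],
            stdout=subprocess.PIPE, text=True,
        )
        acked = []
        for line in child.stdout:
            acked.append(int(line))
            if len(acked) >= acks_before_kill:
                break
        child.kill()  # SIGKILL on POSIX: the child gets no chance to clean up
        child.wait()
        child.stdout.close()
        recovered = read_valid(path)
        for i in acked:
            assert recovered[i] == b"entry-%d" % i, f"lost acknowledged entry {i}"
        return len(acked), len(recovered)
```

Note the limit of what this proves: killing a process leaves the kernel's page cache intact, so the test exercises crash recovery but not power loss. Covering the latter needs fault injection below the filesystem, which is outside this sketch.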

The bottom line

The result is operationally liberating: deploys, reschedules, and crashes are non-events. The cache you had is the cache you have. Boring, by design, provably.