347 tests and a murder weapon: how the suite is organized
Bottom-heavy by design: the layers that hold your data get the most hostile coverage, and the smoke suite's signature move is killing the process to prove a point.
The suite's shape encodes a belief: bug severity scales with stack depth, so coverage should too. The base layers get the bulk (64 tests on KV operations, 37 on the engine's WAL replay, flush, compaction, and batches) because a storage bug costs data, and data here costs literal dollars to rebuild. Above them, vector persistence, semantic gating, and 18 stress scenarios cover the semantic layer; the smoke suite caps the stack with end-to-end checks.
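For a flavor of what a hostile WAL replay test might look like, here is a minimal sketch. The record layout (length, CRC32, then a length-prefixed key and value) and the names `encode` and `replay` are assumptions made for illustration, not the engine's actual format or API. The idea it shows is the general one: replay must stop at the first torn or corrupt record and keep everything verified before it.

```python
import struct
import zlib

# Hypothetical record layout: payload length and CRC32 of the payload,
# both little-endian unsigned 32-bit integers, followed by the payload.
HEADER = struct.Struct("<II")


def encode(key: bytes, value: bytes) -> bytes:
    """Serialize one key/value write as a checksummed WAL record."""
    payload = struct.pack("<I", len(key)) + key + value
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def replay(log: bytes) -> dict:
    """Rebuild state from a log, stopping at the first bad record."""
    state = {}
    offset = 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn or corrupt tail: keep only what was verified
        (key_len,) = struct.unpack_from("<I", payload, 0)
        key = payload[4:4 + key_len]
        state[key] = payload[4 + key_len:]
        offset = start + length
    return state


# Simulate a crash mid-write by chopping bytes off the last record.
log = encode(b"a", b"1") + encode(b"b", b"2") + encode(b"c", b"3")
recovered = replay(log[:-2])
```

In this sketch the truncated third record is dropped and the first two survive, which is the property a replay test would assert.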
The smoke suite's character is adversarial: it writes entries, kills the container without ceremony, restarts, and demands everything back; it probes the management API unauthenticated and expects rejection; it checks that gRPC and RESP agree about the same cache and that migration leases survive the violence. Happy paths are table stakes — the suite specializes in unhappy ones.
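The kill-and-recover move can be sketched without Docker. In the example below a child Python process stands in for the container, and a tab-separated append-only log stands in for the real storage engine; both are assumptions for illustration, as are the names `WRITER`, `recover`, and `durability_drill`. It only demonstrates the shape of the drill: write, kill without a graceful shutdown, then read everything back.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for the cache server: appends each entry to a log,
# fsyncs it, announces readiness, then idles until it is killed.
WRITER = r"""
import os, sys, time
path, count = sys.argv[1], int(sys.argv[2])
with open(path, "a", encoding="utf-8") as log:
    for i in range(count):
        log.write(f"key-{i}\tvalue-{i}\n")
        log.flush()
        os.fsync(log.fileno())
    print("ready", flush=True)
    time.sleep(60)  # the process never reaches a clean shutdown
"""


def recover(path):
    """Replay the log into a dict, as a restarted process would."""
    entries = {}
    with open(path, encoding="utf-8") as log:
        for line in log:
            key, value = line.rstrip("\n").split("\t")
            entries[key] = value
    return entries


def durability_drill(count=100):
    """Write `count` entries, kill the writer, return what survived."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal.log")
        proc = subprocess.Popen(
            [sys.executable, "-c", WRITER, path, str(count)],
            stdout=subprocess.PIPE,
            text=True,
        )
        assert proc.stdout.readline().strip() == "ready"
        proc.kill()  # SIGKILL on POSIX: no handlers, no graceful flush
        proc.wait()
        proc.stdout.close()
        return recover(path)
```

A drill passes when every acknowledged entry comes back, for example `len(durability_drill(100)) == 100`. Killing a process is a weaker test than cutting power, so a real drill against a container and its volume proves more than this sketch does.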
Five stages score every write before it can ever be served.
Every release passes the identical gate: full suite, Docker build, boot, health, auth boundary, durability drill. The rule has no exception process, which is the only kind of rule that survives deadline pressure. A release that lowers any bar isn't a release; it's a regression with version numbers.
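A gate with no exception process is easy to express in code: the runner simply has no argument for skipping a stage. The sketch below is one way to write that, assuming each stage is a callable that returns True on success. The stage names come from the list above; `GATE_STAGES`, `run_gate`, and the check functions are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Stage names follow the text; the checks behind them are hypothetical.
GATE_STAGES = (
    "full suite",
    "docker build",
    "boot",
    "health",
    "auth boundary",
    "durability drill",
)


def run_gate(checks: Dict[str, Callable[[], bool]]) -> Tuple[bool, List[str]]:
    """Run every stage in order and stop at the first failure.

    There is deliberately no skip list or override argument: a missing
    check counts as a failure, not as an exemption.
    """
    report: List[str] = []
    for stage in GATE_STAGES:
        check = checks.get(stage)
        if check is None:
            report.append(f"{stage}: MISSING")
            return False, report
        try:
            ok = bool(check())
        except Exception as exc:  # a crashing check is a failing check
            report.append(f"{stage}: ERROR ({exc})")
            return False, report
        report.append(f"{stage}: {'ok' if ok else 'FAILED'}")
        if not ok:
            return False, report
    return True, report
```

Calling `run_gate` with a dictionary that lacks any stage fails the release, which is the code-level version of having no exception process.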
The bottom line
Test counts are a vanity metric; test hostility isn't. Ours are hostile where your data lives, which is the only place hostility pays.