features · May 11, 2026 · 3 min read

CDEDUP: collapsing the answers that mean the same thing

A semantic cache slowly accumulates near-duplicate answers. CDEDUP finds the clusters that mean the same thing and collapses them, reclaiming memory — and Crowkis is honest about its cost.

Over time a semantic cache fills with near-duplicates: a dozen phrasings of the same question, each stored as its own entry pointing at substantially the same answer. That's memory spent storing the same knowledge many times. CDEDUP runs semantic deduplication — it finds the clusters of entries that mean the same thing and collapses them, reporting the clusters found and the memory reclaimed.

In plain words: A semantic cache piles up many rewordings of the same answer. CDEDUP merges the duplicates to free memory — best run as off-peak maintenance, since a big pass is heavy.

The mechanism reuses the cache's own intelligence: the embeddings already exist, so dedup is a clustering pass over vectors it has, not a fresh round of embedding. The bounded, batched path caps how much one run processes, so a maintenance pass doesn't try to fold the entire store in a single breath.
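To make the idea concrete, here is a minimal sketch of a bounded clustering pass over embeddings that are already stored. It is an illustration under stated assumptions, not Crowkis's implementation: the function name `dedup_batch`, the greedy first-match clustering, the 0.95 cosine threshold, and the `max_entries` cap are all hypothetical choices made for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def dedup_batch(
    entries: Dict[str, Tuple[List[float], int]],  # key -> (embedding, size in bytes)
    threshold: float = 0.95,  # hypothetical similarity cutoff
    max_entries: int = 1000,  # hypothetical cap on entries examined per run
) -> Tuple[List[List[str]], int]:
    """Greedy clustering over at most max_entries stored embeddings.

    Returns the clusters that have more than one member, plus the bytes
    reclaimed if only the first member of each such cluster is kept.
    """
    batch = list(entries.items())[:max_entries]  # the bound: never the whole store
    clusters: List[List[str]] = []  # member keys per cluster
    reps: List[List[float]] = []    # representative embedding per cluster
    for key, (vec, _size) in batch:
        for i, rep in enumerate(reps):
            if cosine(vec, rep) >= threshold:
                clusters[i].append(key)  # joins the first close-enough cluster
                break
        else:
            clusters.append([key])       # no match: start a new cluster
            reps.append(vec)
    dupes = [c for c in clusters if len(c) > 1]
    reclaimed = sum(entries[k][1] for c in dupes for k in c[1:])
    return dupes, reclaimed
```

No embedding model is called anywhere in the sketch: it only compares vectors it is handed, which is the point of reusing what the cache already has. The cap limits how many entries a single call touches, so a large store would be worked through over several runs.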

We're also honest about where this is today: on a loaded instance, a dedup pass is heavy, and because it runs on the single-writer actor it can block live traffic while it works. The guidance is straightforward — treat CDEDUP as scheduled, off-peak maintenance rather than a hot-loop operation, and let the bounded path keep each run's footprint in check.
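One way to keep a scheduled job from drifting into busy hours is a small time-window guard in the maintenance script. The sketch below is an assumption about how you might wire that up, not part of Crowkis: `in_off_peak_window` is a hypothetical helper, and how the pass itself is triggered is left out because it depends on your client and deployment.

```python
from datetime import time


def in_off_peak_window(now: time, start: time, end: time) -> bool:
    """True if now falls in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    # Window wraps midnight, e.g. 23:00 to 01:00.
    return now >= start or now < end
```

A maintenance script would check the guard first and exit without running the pass when it returns False, so a delayed or manually triggered job cannot land on peak traffic.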

The bottom line

Dedup is housekeeping, and good housekeeping is scheduled, not constant. CDEDUP reclaims the memory that semantic variety quietly consumes — run it on a cron, off-peak, and the cache stays lean without standing in front of your users.