economics · June 6, 2026 · 3 min read

The token math of repetition: what your duplicate questions actually cost

Take your daily query volume, multiply by the repeat fraction, multiply by your blended price per call. That number, every day of the year, is the cache argument.

Most teams have never run the three-number multiplication that justifies caching: daily LLM calls, times the fraction that are semantic repeats of earlier calls, times your blended cost per call. A modest product doing 200,000 calls a day with a 45% repeat fraction at a cent per blended call is leaking nine hundred dollars daily, or roughly $328,500 a year: a fully loaded engineer's salary spent re-purchasing known answers.
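The multiplication is small enough to sketch in a few lines of Python. The function and parameter names here are illustrative and not part of any Crowkis API. The sketch assumes a flat blended price per call and treats every repeat as avoidable at zero cost, so the result is an upper bound on what a cache could save.

```python
def repeat_spend(daily_calls: int, repeat_fraction: float, cost_per_call: float) -> dict:
    """Estimate spend on repeated calls (hypothetical helper, flat pricing assumed)."""
    if not 0.0 <= repeat_fraction <= 1.0:
        raise ValueError("repeat_fraction must be between 0 and 1")
    # Daily spend on calls whose answers were already purchased once.
    daily = daily_calls * repeat_fraction * cost_per_call
    return {
        "daily": round(daily, 2),
        "monthly": round(daily * 365 / 12, 2),  # average month
        "annual": round(daily * 365, 2),
    }


# The example figures from the paragraph above.
estimate = repeat_spend(daily_calls=200_000, repeat_fraction=0.45, cost_per_call=0.01)
print(estimate)
```

With the figures above this comes to $900 a day and $328,500 a year. Swap in your own three numbers; the repeat fraction is the one you will most likely have to estimate.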

In plain words: Your bill = new questions + repeated questions. The repeated half can cost almost nothing. Most teams have simply never measured how big that half is.

The repeat fraction is the number everyone underestimates, because exact-match thinking hides it. Byte-identical repeats might be 3% of traffic; semantic repeats (same intent, different words) routinely land between 30% and 70% depending on workload. Support and docs sit at the top of that range; agent fleets sometimes exceed it. You cannot see this fraction in logs without semantic analysis, which is why it survives unexamined.
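To see how exact-match counting hides repeats, here is a minimal sketch that measures the repeat fraction of a query log under two different matchers. Everything in it is an assumption made for illustration: the sample log is invented, and the `overlap` matcher uses plain word overlap (Jaccard similarity) as a crude stand-in for real semantic similarity. A serious measurement would compare embedding vectors instead, and this is not a description of how Crowkis does it.

```python
import re
from typing import Callable, List


def tokens(text: str) -> frozenset:
    """Lowercased word set; a crude stand-in for an embedding vector."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def exact(a: str, b: str) -> bool:
    """Byte-identical match, the way a naive cache key would compare."""
    return a == b


def overlap(a: str, b: str, threshold: float = 0.6) -> bool:
    """Jaccard word overlap; a toy proxy, NOT true semantic similarity."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return False
    return len(ta & tb) / len(ta | tb) >= threshold


def repeat_fraction(queries: List[str], is_repeat: Callable[[str, str], bool]) -> float:
    """Fraction of queries that repeat some earlier query under is_repeat."""
    seen: List[str] = []  # one representative per distinct question so far
    repeats = 0
    for q in queries:
        if any(is_repeat(q, earlier) for earlier in seen):
            repeats += 1
        else:
            seen.append(q)
    return repeats / len(queries) if queries else 0.0


# Invented sample log: two intents, five phrasings.
log = [
    "How do I reset my password?",
    "how do i reset my password",
    "How do I reset my password?",
    "What is your refund policy?",
    "what is your refund policy please",
]
exact_fraction = repeat_fraction(log, exact)
fuzzy_fraction = repeat_fraction(log, overlap)
print(f"exact-match repeats: {exact_fraction:.0%}, fuzzy repeats: {fuzzy_fraction:.0%}")
```

On this toy log the exact matcher finds one repeat in five queries while the fuzzy matcher finds three. The pairwise comparison is quadratic in the number of distinct questions, so run it on a sample rather than a full day of traffic.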

what repeated traffic costs without crowkis

Every paraphrase is a fresh bill — unless the cache understands meaning.

Crowkis makes the math visible and then makes it stop: the dashboard's saved-spend counter is the multiplication running live against your actual traffic, and Enterprise's Replay runs it retroactively on a sample before you commit to anything. No projections — your data, your prices, your number.

The bottom line

Run the multiplication this week, even on a napkin. Every month it stays unrun is a month the answer compounds against you. The cache deploys in five minutes; the leak has been billing since launch.