Crowkis vs LangChain InMemoryCache: the default that quietly costs the most
One call gives you LangChain's in-memory exact cache. It's the caching equivalent of a sticky note: gone on restart, blind to paraphrase, local to one process.
set_llm_cache(InMemoryCache()) is one line, and one line feels like progress. Here's what the line buys: an exact-match dictionary in one Python process, keyed on the exact prompt text and model settings. Restart the pod, as every deploy does, and it's empty. Scale to three replicas and each warms its own private copy. Phrase the question differently and it's a miss. It's caching as a gesture.
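To make those three gaps concrete, here is a toy model of an exact-match, per-process cache. ExactCache, get_or_call, and fake_model are hypothetical names invented for this sketch. It is not LangChain's implementation, only the same shape: a dictionary that lives in one process and compares keys byte for byte.

```python
# Toy model of an exact-match, per-process cache.
# ExactCache is a hypothetical stand-in, not LangChain's implementation.
class ExactCache:
    def __init__(self):
        self._store = {}   # lives only in this process's memory
        self.misses = 0    # each miss is one paid model call

    def get_or_call(self, prompt, model_call):
        if prompt not in self._store:   # byte-for-byte key comparison
            self.misses += 1
            self._store[prompt] = model_call(prompt)
        return self._store[prompt]


def fake_model(prompt):
    # Stand-in for a paid LLM call.
    return f"answer to: {prompt}"


replica_a = ExactCache()
replica_a.get_or_call("What is our refund policy?", fake_model)  # miss, paid
replica_a.get_or_call("What is our refund policy?", fake_model)  # hit, free
replica_a.get_or_call("what's the refund policy?", fake_model)   # paraphrase, miss

# A second replica, or replica_a after a restart, starts empty.
replica_b = ExactCache()
replica_b.get_or_call("What is our refund policy?", fake_model)  # miss again

print(replica_a.misses, replica_b.misses)
```

One question asked four times produced three paid calls: one for the first ask, one for the paraphrase, one for the second replica.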
Gestures have real costs at LLM prices. The cache that forgets on every deploy re-purchases its entire contents after each one. The cache that can't see across replicas multiplies that by your replica count. The cache that needs exact bytes misses every request that arrives paraphrased, and free-form questions rarely repeat byte for byte. The line was free; the gaps bill monthly.
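The multiplication is easy to run. Every number below is hypothetical, chosen only to make the arithmetic visible, and the sketch assumes each replica eventually sees every distinct prompt. Paraphrase misses are left out, so the in-process figure is a floor rather than an estimate.

```python
# All numbers are hypothetical, chosen only to show the arithmetic.
distinct_prompts = 1_000    # unique exact prompts the workload repeats
replicas = 3                # each replica warms its own private copy
deploys_per_month = 20      # each deploy restarts every replica, emptying its cache
cost_per_call_cents = 1     # assumed price of one uncached model call


def warmup_calls(prompts, copies, resets):
    # Every copy of the cache re-buys every prompt once after each reset.
    return prompts * copies * resets


in_process = warmup_calls(distinct_prompts, replicas, deploys_per_month)
shared_durable = warmup_calls(distinct_prompts, 1, 1)  # bought once, kept

print(in_process, shared_durable)
print(in_process * cost_per_call_cents, shared_durable * cost_per_call_cents)
```

Under these assumptions the per-process cache pays for 60,000 warm-up calls a month where a shared, durable one pays for 1,000.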
Five agents asking one question should cost one answer.
Crowkis replaces the gesture with infrastructure and keeps the one-line ergonomics: the SDK's get_or_compute wraps your model call, and behind it sits a shared, durable, semantic, trust-gated engine serving every replica and every app. Survives deploys, matches meaning, refuses poison, shows receipts.
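The get-or-compute pattern itself is small enough to sketch. This piece does not show the Crowkis SDK's actual signature, so everything below is a hypothetical stand-in: SharedStore, normalize, and fake_model are invented for illustration, and the normalization step is only a crude placeholder for real meaning-based matching.

```python
# Hypothetical sketch of the get-or-compute pattern against one shared store.
# SharedStore is NOT the Crowkis SDK; normalize() is a crude stand-in for
# semantic matching and only ignores case, punctuation, and extra spaces.
import string


class SharedStore:
    def __init__(self):
        self._answers = {}
        self.model_calls = 0   # counts paid calls across all agents

    @staticmethod
    def normalize(question):
        table = str.maketrans("", "", string.punctuation)
        return " ".join(question.lower().translate(table).split())

    def get_or_compute(self, question, compute):
        key = self.normalize(question)
        if key not in self._answers:
            self.model_calls += 1
            self._answers[key] = compute(question)
        return self._answers[key]


def fake_model(question):
    # Stand-in for a paid LLM call.
    return f"answer to: {question}"


store = SharedStore()   # one store, shared by every agent
questions = [
    "What is our refund policy?",
    "what is our refund policy",
    "What is our refund policy??",
    "  What is our   refund policy? ",
    "WHAT IS OUR REFUND POLICY?",
]
answers = [store.get_or_compute(q, fake_model) for q in questions]

print(store.model_calls)
```

Five agents, five surface forms, one paid call. The caller's code stays a single wrapped call, while sharing, durability, and matching live behind it.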
The bottom line
Defaults are for demos. The moment a workload earns replicas, it has earned a real memory. Promotion takes five minutes.