CTHINK and CREUSE: banking a chain of thought and replaying it
The reasoning is the expensive part of a hard answer. CTHINK stores a chain-of-thought trace as a reusable step graph; CREUSE fetches the matching plan for a new query at a fraction of the original token cost.
On a genuinely hard query, the tokens go into the thinking — the step-by-step derivation, the plan, the structured analysis — far more than into the final sentence. Response caching can't touch that cost, because two questions with different specifics produce different final answers even when the reasoning is identical. CTHINK and CREUSE cache the reasoning itself.
CTHINK takes a chain-of-thought trace and stores it as a step DAG: typed reasoning steps, linked by their dependencies, with the specifics abstracted into slots. That slot-abstracted graph is the skeleton. CREUSE takes a new query, matches its structure against the stored skeletons, and returns the reasoning plan that fits, ready for the new specifics to be substituted in. Reuse runs at roughly fifteen percent of the original token cost, because you're recomposing a proven structure, not re-deriving it.
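To make the store-then-recompose idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the product's API: the names `Step`, `SkeletonStore`, and the `problem_class` key are hypothetical, and an exact key lookup stands in for whatever structural matching the real system performs.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class Step:
    kind: str            # step type, e.g. "extract" or "compute"
    template: str        # step text with $slot placeholders for the specifics
    deps: tuple = ()     # indices of earlier steps this one builds on


class SkeletonStore:
    """Hypothetical in-memory store; keyed lookup stands in for real matching."""

    def __init__(self):
        self._skeletons = {}

    def cthink(self, problem_class, steps):
        # Edges may only point backwards, which keeps the step graph acyclic.
        for i, step in enumerate(steps):
            if any(d < 0 or d >= i for d in step.deps):
                raise ValueError(f"step {i} must depend only on earlier steps")
        self._skeletons[problem_class] = tuple(steps)

    def creuse(self, problem_class, slots):
        steps = self._skeletons.get(problem_class)
        if steps is None:
            return None          # no stored skeleton: reason from scratch
        try:
            # Recompose the stored plan with the new query's specifics.
            return [Template(s.template).substitute(slots) for s in steps]
        except KeyError:
            return None          # a required slot is missing: not a fit


store = SkeletonStore()
store.cthink("amortization", [
    Step("extract", "Principal is $principal, annual rate is $rate, term is $months months."),
    Step("compute", "Monthly rate is $rate / 12.", deps=(0,)),
    Step("compute", "Apply the annuity formula over $months payments.", deps=(0, 1)),
])

plan = store.creuse(
    "amortization",
    {"principal": "250000", "rate": "0.06", "months": "360"},
)
```

The second loan question never re-derives the three steps; it only fills the slots. A query with no stored skeleton, or one missing a required slot, gets `None` back and falls through to full reasoning.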
Reuse only when meaning, structure, confidence, and trust all agree.
The worked-example intuition: the way you solve one amortization problem is the way you solve all of them, and the troubleshooting tree for one error class fits the next instance with new values. The first solve pays full reasoning tokens; every structural sibling after it pays only for recomposition. Reuse is gated by the same confidence machinery as everything else, so a skeleton is served only where the match clears the bar.
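The gate and the cost arithmetic can be sketched together. The following Python is a hedged illustration only: `MatchSignals`, the 0.8 bar, and the 1,000-token figure are hypothetical values chosen for the example, and the "roughly fifteen percent" reuse cost is treated as exactly 15 to keep the arithmetic simple.

```python
from dataclasses import dataclass
from typing import Optional

REUSE_PERCENT = 15   # "roughly fifteen percent", treated as exact for the example
BAR = 0.8            # hypothetical threshold; the source does not state one


@dataclass(frozen=True)
class MatchSignals:
    # Hypothetical scores in [0, 1] for the four conditions named in the text.
    meaning: float
    structure: float
    confidence: float
    trust: float


def gate_passes(signals: MatchSignals, bar: float = BAR) -> bool:
    # Reuse only when all four signals agree; one weak signal blocks it.
    scores = (signals.meaning, signals.structure, signals.confidence, signals.trust)
    return all(score >= bar for score in scores)


def tokens_paid(full_cost: int, signals: Optional[MatchSignals]) -> int:
    # signals is None when no stored skeleton matched at all.
    if signals is not None and gate_passes(signals):
        return full_cost * REUSE_PERCENT // 100
    return full_cost


FULL = 1000  # illustrative reasoning cost per query, in tokens
strong = MatchSignals(meaning=0.95, structure=0.92, confidence=0.90, trust=0.88)
weak = MatchSignals(meaning=0.95, structure=0.50, confidence=0.90, trust=0.88)

# One first solve, eight structural siblings, one near-miss that fails the gate.
workload = [None] + [strong] * 8 + [weak]
total = sum(tokens_paid(FULL, s) for s in workload)
baseline = FULL * len(workload)
```

Under these assumed numbers, ten queries cost 3,200 reasoning tokens against a 10,000-token baseline: the first solve and the near-miss each pay in full, and the eight siblings pay 150 apiece.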
The bottom line
No competitor caches the thinking — they cache the conclusion. Reusing reasoning is the deepest saving in the product precisely because it lives in the step between question and answer, where the tokens actually pile up.