Crowkis vs Kong AI Gateway: plugins are not engines
Kong added AI plugins to a great API gateway. A semantic-cache plugin in a proxy is a feature; a semantic cache engine is a product. The difference shows in production.
Kong is excellent infrastructure. If you already run it as your API gateway, its AI plugins add provider routing and semantic caching directly in the proxy. The convenience is real. But examine the plugin architecture: the cache logic runs inside the gateway's request lifecycle, backed by external vector and KV stores, and does embed-compare-serve against a similarity threshold. The hard 20%, the part that makes semantic caching production-safe, doesn't fit in a plugin's budget.
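To make the embed-compare-serve pattern concrete, here is a minimal sketch of a threshold-only semantic cache. It is not Kong's plugin code or Crowkis code: `toy_embed`, `ThresholdCache`, and the 0.8 threshold are hypothetical names and values chosen for illustration, and the bag-of-words embedding is a stand-in for a real embedding model and vector store.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ThresholdCache:
    """Embed the prompt, compare to stored entries, serve if above a threshold."""

    def __init__(self, threshold=0.8):  # assumed value, not a recommended default
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def put(self, prompt, answer):
        self.entries.append((toy_embed(prompt), answer))

    def get(self, prompt):
        query = toy_embed(prompt)
        best_score, best_answer = 0.0, None
        for embedding, answer in self.entries:
            score = cosine(query, embedding)
            if score > best_score:
                best_score, best_answer = score, answer
        # The only decision made here is a single similarity cutoff.
        return best_answer if best_score >= self.threshold else None


cache = ThresholdCache(threshold=0.8)
cache.put("refund policy for orders in 2023", "2023 policy text")

# Same wording except for the year: similarity clears the cutoff,
# so the 2023 answer is served for a 2024 question.
stale_hit = cache.get("refund policy for orders in 2024")
miss = cache.get("how do I reset my password")
```

In this toy setup the two refund prompts differ by one token out of six, so the 2024 question is answered from the 2023 entry. Real embeddings behave differently in detail, but the sketch shows why a single similarity cutoff cannot tell when a small textual difference changes the correct answer.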
That hard 20% is most of Crowkis: intent classes that change reuse rules per query type, structural templates that catch what cosine misses, geometric-mean confidence gating, five-stage write trust with a ledger, cost-aware eviction, freshness with version pinning, cache migration across model upgrades. Plugins wrap; engines decide.
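As one illustration of the list above, geometric-mean confidence gating can be sketched in a few lines. This is an assumption-laden sketch, not Crowkis's implementation: the signal names (`similarity`, `intent_match`, `freshness`), the gate value of 0.75, and the functions `geometric_mean` and `should_serve` are all hypothetical. The point is only the arithmetic: a geometric mean lets one weak signal pull the combined score down harder than an arithmetic mean would.

```python
import math


def geometric_mean(scores):
    """Geometric mean of scores in [0, 1]; any zero score yields 0.0."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(s < 0.0 or s > 1.0 for s in scores):
        raise ValueError("scores must lie in [0, 1]")
    if any(s == 0.0 for s in scores):
        return 0.0
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


def should_serve(signals, gate=0.75):
    """Serve a cached answer only if the combined confidence clears the gate.

    `signals` maps hypothetical signal names to scores in [0, 1].
    """
    return geometric_mean(list(signals.values())) >= gate


# Hypothetical signals: strong similarity and intent match, weak freshness.
signals = {"similarity": 0.95, "intent_match": 0.95, "freshness": 0.4}

arithmetic = sum(signals.values()) / len(signals)
geometric = geometric_mean(list(signals.values()))
serve = should_serve(signals, gate=0.75)
```

With these made-up numbers the arithmetic mean is about 0.77 and would pass a 0.75 gate, while the geometric mean is about 0.71 and does not, so the stale entry is not served.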
Gateway at the boundary, engine behind it
Operationally the split is clean and complementary. Keep Kong at the boundary doing gateway things — auth, routing, rate limits. Point its upstream at Crowkis. Now the proxy proxies and the cache caches, each owned by software whose entire existence is that one job.
The bottom line
The history of infrastructure keeps teaching the same seminar: the checkbox version of a hard problem works until traffic arrives. Caching LLM answers safely is a hard problem. Bring an engine.