Fallback routing: surviving your provider's bad day
Providers have incidents; your product doesn't have to. Health-aware backend routing plus a warm cache turns upstream outages into degraded modes users barely notice.
Every LLM provider has status-page afternoons, and a single-provider architecture inherits all of them as its own outages. The federation layer in Crowkis registers multiple backends (hosted rivals, your local vLLM, anything else you run) with health tracking, and routes around the unhealthy ones automatically. A provider incident becomes a routing event.
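To make the routing idea concrete, here is a minimal sketch of health-aware failover. It is not Crowkis's actual API: `Backend`, `route`, `failure_threshold`, and `cooldown_s` are hypothetical names chosen for illustration, and the health rule (mark a backend unhealthy after N consecutive failures, then allow a probe once a cooldown has passed) is one common choice among several.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Backend:
    """Hypothetical backend record; not the real Crowkis type."""
    name: str
    call: Callable[[str], str]      # sends a prompt, returns an answer, raises on failure
    failure_threshold: int = 3      # consecutive failures before we stop routing here
    cooldown_s: float = 30.0        # how long to wait before probing again
    failures: int = 0
    opened_at: Optional[float] = None

    def healthy(self, now: float) -> bool:
        # Healthy if never tripped, or if the cooldown has elapsed (probe allowed).
        if self.opened_at is None:
            return True
        return now - self.opened_at >= self.cooldown_s

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now    # (re)start the cooldown window


class AllBackendsFailed(RuntimeError):
    pass


def route(prompt: str, backends: List[Backend],
          clock: Callable[[], float] = time.monotonic) -> Tuple[str, str]:
    """Try backends in priority order, skipping ones currently marked unhealthy."""
    for backend in backends:
        if not backend.healthy(clock()):
            continue
        try:
            answer = backend.call(prompt)
        except Exception:
            backend.record_failure(clock())
            continue
        backend.record_success()
        return backend.name, answer
    raise AllBackendsFailed("no healthy backend could answer the prompt")
```

The caller sees either an answer tagged with the backend that produced it or a single exception when every backend is down. The injectable `clock` is there only so the cooldown behavior can be tested without sleeping.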
The cache changes the math of degradation in a way pure gateways can't: during an outage, your entire warm corpus keeps serving at full speed regardless of upstream health. The repeated head of traffic, often the majority in mature deployments, doesn't even notice the incident. Only novel queries feel the fallback, and they get the backup backend instead of an error.
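The request path described above can be sketched as cache first, then backends in priority order. This is an illustration under stated assumptions, not the product's implementation: `CachedGateway` and `normalize` are hypothetical names, and the lookup is a normalized exact match standing in for real semantic matching, which would compare embeddings instead.

```python
from typing import Callable, Dict, List, Tuple


def normalize(query: str) -> str:
    # Stand-in for semantic matching: case-fold and collapse whitespace.
    return " ".join(query.lower().split())


class CachedGateway:
    """Hypothetical gateway: serve from cache when possible, else fall back."""

    def __init__(self, backends: List[Tuple[str, Callable[[str], str]]]):
        self.backends = backends            # (name, call) pairs in priority order
        self.cache: Dict[str, str] = {}

    def ask(self, query: str) -> Tuple[str, str]:
        key = normalize(query)
        # Warm entries never touch upstream, so an outage is invisible to them.
        if key in self.cache:
            return "cache", self.cache[key]
        # Novel query: walk the backends until one answers.
        for name, call in self.backends:
            try:
                answer = call(query)
            except Exception:
                continue                    # this backend is having a bad day
            self.cache[key] = answer        # bank the answer for future repeats
            return name, answer
        raise RuntimeError("no backend available for a novel query")
```

With the primary down, a repeated question returns with source `"cache"` while a new one returns from the backup backend; only the case where every backend fails on a novel query surfaces an error.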
Cross-provider consistency comes from the same machinery as everywhere else: fallback answers face the same write gates as primary answers, and Enterprise's cache bridge means answers banked from the primary serve equivalent queries during the fallback window. No split-brain corpus, no quality whiplash.
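One way to picture the consistency claim is a single cache whose admission check ignores which backend produced the answer. The sketch below is an assumption-heavy illustration: `SharedCache`, `offer`, `lookup`, and `min_length_gate` are hypothetical names, and the length check is a toy placeholder for whatever quality criteria the real write gates apply.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Entry:
    answer: str
    source: str     # backend that produced the answer; kept for auditing only


class SharedCache:
    """Hypothetical single corpus shared by primary and fallback backends."""

    def __init__(self, write_gate: Callable[[str, str], bool]):
        self.write_gate = write_gate
        self.entries: Dict[str, Entry] = {}

    def offer(self, key: str, answer: str, source: str) -> bool:
        # The gate sees the query key and the answer, never the source,
        # so primary and fallback answers are held to the same bar.
        if not self.write_gate(key, answer):
            return False
        self.entries[key] = Entry(answer, source)
        return True

    def lookup(self, key: str) -> Optional[Entry]:
        # Lookups are also source-agnostic: an answer banked from the
        # primary is served no matter which backend is active now.
        return self.entries.get(key)


def min_length_gate(key: str, answer: str) -> bool:
    # Toy gate for illustration: reject empty or very short answers.
    return len(answer.strip()) >= 20
```

Because there is one store and one gate, there is no separate "fallback corpus" to diverge from the primary one.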
The bottom line
Resilience reviews usually price redundancy in standby compute. A warm semantic cache is the cheapest redundancy you'll ever buy: it's the only failover asset that was already paying for itself before the incident.