You're 45 minutes into a complex refactoring session. The AI agent has already verified 12 file operations, updated your tests, and is about to commit. Then you see it:
```
Error 429: Rate limit exceeded. Please try again in 60 seconds.
```

Relying on a single LLM provider for autonomous agents is a single point of failure. When Anthropic goes down or rate limits you, your entire workflow stops.
This is the single-provider problem, and it's why we built the Mythos Orchestration Engine.
## The Architecture
The orchestrator sits between your CLI and the model providers. Instead of hardcoding a single API endpoint, it manages a ranked pool of providers with real-time health tracking.
Here's the provider registration:
```typescript
orchestrator.registerProvider(anthropicProvider, { priority: 0 });
orchestrator.registerProvider(deepseekProvider, { priority: 1 });
orchestrator.registerProvider(openaiProvider, { priority: 2 });
```

Priority 0 is your primary. The others are your safety net.
## Layer 1: Adaptive Selection
Providers are scored using an Exponential Moving Average (EMA) of:

- Success rate: weighted heavily (×100–150 depending on task type)
- Latency: penalized for slow responses
- Cost per 1K tokens: cheaper providers get a boost
By scoring providers in real time, Mythos dynamically shifts traffic. If your primary model starts slowing down, a secondary model automatically takes over before a timeout ever occurs.
The highest-scoring healthy provider wins the request.
| Provider | Base Latency | Cost (1M Tokens) | Orchestrator Rank |
|---|---|---|---|
| Claude 3.5 Sonnet | 1.2s | $15.00 | Primary |
| DeepSeek-V3 | 0.8s | $1.20 | Secondary |
| GPT-4o | 1.5s | $10.00 | Tertiary |
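In TypeScript terms, the scoring described above might look something like the following minimal sketch. The `ProviderStats` shape, the weights, and the smoothing factor are illustrative assumptions, not the engine's actual API:

```typescript
// Sketch of EMA-based provider scoring. The stat shape, weights, and
// smoothing factor are illustrative assumptions, not the engine's real API.
interface ProviderStats {
  successRate: number;    // EMA of success/failure, in [0, 1]
  avgLatencyMs: number;   // EMA of response latency
  costPer1kTokens: number;
}

const ALPHA = 0.2; // EMA smoothing factor: higher reacts faster to new data

// Fold a new observation into a running EMA.
function ema(previous: number, observation: number): number {
  return ALPHA * observation + (1 - ALPHA) * previous;
}

// Higher is better: reward reliability, penalize latency and cost.
function score(stats: ProviderStats, successWeight = 100): number {
  return (
    stats.successRate * successWeight -
    stats.avgLatencyMs / 100 -
    stats.costPer1kTokens * 10
  );
}
```

With weights like these, a cheap, fast provider can out-score an expensive one even at a slightly lower success rate, which is exactly the dynamic traffic shift described above.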
## Layer 2: Retry with Backoff
If the selected provider fails with a retryable error (like Anthropic's `OverloadedError`), the engine retries with exponential backoff: 100ms → 500ms → 1000ms.
This handles transient blips without wasting a fallback slot.
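A minimal sketch of that retry loop, using the backoff schedule above; the `withRetry` helper and the `isRetryable` check are illustrative assumptions, not the engine's real code:

```typescript
// Sketch of the retry layer. The delay schedule comes from the text;
// the helper names and the retryable-error check are assumptions.
const BACKOFF_MS = [100, 500, 1000];

function isRetryable(err: unknown): boolean {
  // e.g. HTTP 429 (rate limit) / 529 (overloaded) style errors
  const status = (err as { status?: number }).status;
  return status === 429 || status === 529;
}

async function withRetry<T>(fn: () => Promise<T>): Promise<T> {
  let lastError: unknown;
  // One initial attempt plus one retry per backoff slot.
  for (let attempt = 0; attempt <= BACKOFF_MS.length; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (!isRetryable(err) || attempt === BACKOFF_MS.length) break;
      await new Promise((r) => setTimeout(r, BACKOFF_MS[attempt]));
    }
  }
  throw lastError; // exhausted: the caller trips the circuit breaker
}
```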
## Layer 3: Circuit Breaker
If all retries are exhausted, the provider is marked as "degraded" and put on a 5-minute cooldown. The request automatically falls through to the next provider in the ranked list.
```
Primary (Claude) → 429 → Retry 1 → Retry 2 → Retry 3 → TRIP BREAKER
                                                            ↓
Fallback (DeepSeek) → ✓ Connection established. Streaming.
```

The stream continues. Your session doesn't break. Your context is preserved. You can read more about how this integrates with the broader system on the Mythos Engine Reference page.
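The breaker bookkeeping can be sketched as a small registry. The `BreakerRegistry` class below is an illustrative assumption; only the 5-minute cooldown and the ranked fallthrough come from the design above:

```typescript
// Sketch of circuit-breaker bookkeeping. The 5-minute cooldown comes
// from the text; the data structure and names are assumptions.
const COOLDOWN_MS = 5 * 60 * 1000;

class BreakerRegistry {
  private trippedUntil = new Map<string, number>();

  // Mark a provider as degraded: no traffic until the cooldown expires.
  trip(provider: string, now = Date.now()): void {
    this.trippedUntil.set(provider, now + COOLDOWN_MS);
  }

  isHealthy(provider: string, now = Date.now()): boolean {
    const until = this.trippedUntil.get(provider);
    return until === undefined || now >= until;
  }

  // First healthy provider in priority order wins the request.
  pick(ranked: string[], now = Date.now()): string | undefined {
    return ranked.find((p) => this.isHealthy(p, now));
  }
}
```

A tripped primary falls out of `pick` until its cooldown lapses, then automatically rejoins the rotation.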
## Stream Watchdog
There's a subtler failure mode that most tools ignore: stream stalls. The connection is open, the API hasn't errored, but no tokens have arrived in 15 seconds.
The orchestrator runs an adaptive watchdog on every stream:

- Default timeout: 15 seconds of silence
- Adaptive: 3× the provider's average latency (EMA-tracked)
- If the watchdog fires, the stream is aborted and the request is re-routed
This catches the "API is technically alive but functionally dead" scenario.
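A sketch of that watchdog; the names, and the way the default and adaptive timeouts are combined (taking the larger of the two), are assumptions:

```typescript
// Sketch of an adaptive stream watchdog: fire if no token arrives within
// max(default, 3 × EMA latency). Names and the combination rule are assumptions.
function makeWatchdog(
  avgLatencyMs: number,
  onStall: () => void,
  defaultMs = 15_000,
) {
  const timeoutMs = Math.max(defaultMs, 3 * avgLatencyMs);
  let timer = setTimeout(onStall, timeoutMs);

  return {
    // Call on every received token to reset the silence timer.
    feed(): void {
      clearTimeout(timer);
      timer = setTimeout(onStall, timeoutMs);
    },
    // Call when the stream ends normally.
    stop(): void {
      clearTimeout(timer);
    },
  };
}
```

Each incoming token calls `feed()`; if the timer ever fires, the stream is aborted and the request re-routed to the next provider.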
## Deterministic Mode
For operations where you need exactly the same provider every time (like when a Skill forces claude-3-5-sonnet), the orchestrator supports deterministic selection. If you want to dive deeper into the code, check out the mythos-router repository on GitHub.
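Deterministic selection can be sketched as a pre-check that bypasses scoring entirely; `selectProvider` and `pinnedModel` are illustrative names, not the repository's actual API:

```typescript
// Sketch of deterministic vs. adaptive selection. Names are illustrative.
interface Provider {
  name: string;
  model: string;
}

function selectProvider(
  ranked: Provider[],
  scores: Map<string, number>,
  pinnedModel?: string,
): Provider | undefined {
  if (pinnedModel) {
    // Deterministic mode: exactly this model, or nothing (fail fast).
    return ranked.find((p) => p.model === pinnedModel);
  }
  // Adaptive mode: highest-scoring provider wins the request.
  return [...ranked].sort(
    (a, b) => (scores.get(b.name) ?? 0) - (scores.get(a.name) ?? 0),
  )[0];
}
```

When a Skill pins a model, the scorer is never consulted, so the same provider is chosen every time regardless of health or cost.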
## The Cost
Adding fallback providers costs nothing until one fires: you only pay for the tokens a fallback actually consumes. In practice, fallbacks trigger less than 5% of the time, but that 5% is the difference between a broken session and a saved one.
## Setup
Add your fallback API keys to your environment:
```shell
export ANTHROPIC_API_KEY="sk-ant-..."  # Primary
export DEEPSEEK_API_KEY="sk-..."       # Fallback 1
export OPENAI_API_KEY="sk-..."         # Fallback 2
```

Then run mythos-router as usual:

```shell
npx mythos-router chat
```

The orchestrator detects available keys and registers providers automatically. If Claude goes down, DeepSeek picks up. If DeepSeek goes down, OpenAI picks up. Your work never stops.