Live · Routing 2.4M calls / day

Stop Overpaying for AI.
Relay routes every call to the cheapest model that works.

A drop-in proxy for OpenAI, Anthropic, and Groq. Relay classifies every request and ships it to the right model — Opus when you need it, Haiku when you don't. Zero code changes.

Users have saved

in API costs this quarter

⚡ Install in 60 seconds See how it works →

$ clawhub install nexusrelay/relay

Why Relay

Three features. One outcome: lower bills.

You don't need Opus for a one-line prompt. Relay figures out what you actually need and routes accordingly.

🎯

Smart Routing

Relay inspects every request — token count, tool usage, reasoning flags — and picks the cheapest model that can handle it. Fast tier for short prompts, premium tier only when justified.

📊

Real-time Analytics

Live dashboard shows every call: model requested vs. model used, tokens, savings per call. Know exactly where your AI budget goes — and what Relay saved you.

⚡

Zero Code Changes

OpenAI-compatible endpoint. Change your base URL, keep your SDK. Works with every agent framework, every language. Deploy it in a minute.

Under the hood

How Relay decides.

A deterministic heuristic engine. No magic, no surprises. Full routing logs for every call.

1

Request arrives

Your agent hits POST /v1/chat/completions — same format as OpenAI.
2

Classify

Token count, tool_use blocks, system prompt length, reasoning flags. Three tiers: Fast, Balanced, Premium.
3

Route

Fast → Groq Llama-3.3 or Haiku. Balanced → Sonnet or GPT-4o-mini. Premium → Opus or GPT-4.
4

Ship response

Provider response comes back untouched. Your agent never knows the difference — except on the invoice.

# Classification logic
if tokens < 500 and not has_tools:
    tier = "FAST"
    model = "llama-3.3-70b" # $0.0006/1k

elif tokens < 2000:
    tier = "BALANCED"
    model = "claude-sonnet-4" # $0.003/1k

else:
    tier = "PREMIUM"
    model = "claude-opus-4" # $0.015/1k

log(tier, model, tokens, savings)

Pricing

Pay for what you route.

Start free. Upgrade when Relay pays for itself (which is usually week one).

Starter

For solo builders kicking the tires.

Up to 10,000 calls / month
All three tiers enabled
Basic analytics (7 days)
Community support

Get started

Growth

$29/mo

For indie apps and small teams.

Up to 250,000 calls / month
30-day analytics retention
Custom routing rules
Email support

Start Growth

Operators don't pay for what they don't need.

We were burning $4k/mo on Opus for what turned out to be short classification calls. Relay cut that to $900 without touching a line of code.

Maya Ramirez

CTO · Fieldwire Agents

↓ 78% monthly spend

Installed Relay before standup. By end of the day it had rerouted 11k calls and saved us $312. That's a wrap.

Devraj Shah

Founding Eng · Loom for Sales

$312 saved day one

The dashboard alone is worth the price. I can finally see which agents are blowing the budget and which ones were fine running on Haiku the whole time.

Ana Kowalski

Platform Lead · Arbor AI

12 agents audited in week 1

FAQ

Questions we hear a lot.

Does Relay ever send my data somewhere new?

No. Relay routes your request to the same providers you already use (OpenAI, Anthropic, Groq). It doesn't log prompts or completions — only metadata (token counts, model used, latency). Self-hosted option available on Scale.

How does routing not break my agent's quality?

Relay only downgrades when it's safe — short prompts, no tools, no reasoning flags. Anything ambiguous stays on Premium. You can also pin specific agents or routes to a tier, or disable routing and just use Relay for observability.

What's the latency overhead?

Classification is pure JS — under 2ms. Network hop to Relay adds ~10-30ms depending on region. For streaming responses, Relay passes bytes through as they arrive, so time-to-first-token is basically unchanged.

Do I still use my own API keys?

Yes. You give Relay your provider keys once (encrypted at rest) and Relay uses them to call the providers on your behalf. Billing stays on your accounts. Relay charges a flat monthly fee — no per-token markup.

Can I write custom routing rules?

Yes, on Growth and up. Rules are JSON: match on agent id, endpoint, token count, or any header, and pin to a tier or a specific model. Everything else falls through to the default heuristic.

Is there a self-hosted version?

Yes — the MVP proxy is open-source and ships with the nexusrelay/relay ClawHub skill. The managed version adds dashboards, auth, retention, and SLAs.

Route your first call in 60 seconds.

Your agents are already sending requests somewhere. Send them through Relay instead — see the savings by lunch.

⚡ Install Relay View demo dashboard →

Three features. One outcome: lower bills.

Smart Routing

Real-time Analytics

Zero Code Changes

How Relay decides.

Request arrives

Classify

Route

Ship response

Pay for what you route.

Operators don't pay for what they don't need.

Questions we hear a lot.

Route your first call in 60 seconds.