An OpenAI-compatible API over OpenAI, Anthropic, Google Gemini, DeepSeek, and Mistral. Drop-in replacement for api.openai.com/v1: same paths, same JSON. Pass any model id we list and we forward the request to the right provider, billed against your tokenroute balance.
Sign in, create an API key, top up your balance, and point any OpenAI-compatible client at https://api.tokenroute.io/v1:
```bash
curl https://api.tokenroute.io/v1/chat/completions \
  -H "Authorization: Bearer $TOKENROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Write a haiku about routing."}]
  }'
```

Pass your key as a Bearer token. Keys start with `sk-tr-`. The key's secret is shown only once, at creation; store it in an environment variable, because we don't keep a copy you can read.
```
Authorization: Bearer sk-tr-XXXXXXXXXXXXXXXXXXXXXXXX
```

Lost a key? Revoke it in /dashboard/keys and create a new one.
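To sanity-check a key before wiring it into an app, you can call `GET /v1/models` (see the endpoints note at the end) with the same header. A minimal sketch, assuming the key is exported as `TOKENROUTE_API_KEY` and that the endpoint returns the standard OpenAI list shape:

```python
import os
import requests

# Any 401 here means the key is missing, invalid, or revoked.
resp = requests.get(
    "https://api.tokenroute.io/v1/models",
    headers={"Authorization": f"Bearer {os.environ['TOKENROUTE_API_KEY']}"},
    timeout=30,
)
resp.raise_for_status()
print("key OK:", len(resp.json()["data"]), "models visible")
```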
Pass any of these strings as the `model` field. Each forwards to the corresponding upstream — your tokenroute key is the only credential you carry.
| Model | Provider | Best for |
|---|---|---|
| gpt-4o, gpt-4o-mini | OpenAI | General-purpose, reliable |
| claude-opus-4-1, claude-sonnet-4-5, claude-haiku-4-5 | Anthropic | Coding, long context, tool use |
| gemini-2.5-pro, gemini-2.0-flash | Google | Multilingual, video & image input; very cheap Flash tier |
| deepseek-chat, deepseek-reasoner | DeepSeek | Strong code & math at a fraction of frontier cost; deepseek-reasoner (R1) for reasoning |
| mistral-large, codestral | Mistral | European hosting; Codestral specialised for code completion |
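You can also discover the ids at runtime: in the official SDK, `client.models.list()` calls `GET /v1/models`. A sketch, assuming the standard OpenAI list response:

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-tr-...",  # your tokenroute key
    base_url="https://api.tokenroute.io/v1",
)

# Each entry's .id is a string you can pass as the "model" field.
for model in client.models.list():
    print(model.id)
```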
Python (official `openai` SDK):

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-tr-...",
    base_url="https://api.tokenroute.io/v1",
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-5",  # or gpt-4o, gemini-2.5-pro, deepseek-reasoner…
    messages=[{"role": "user", "content": "Prove there are infinitely many primes."}],
)
print(resp.choices[0].message.content)
```

Node.js (official `openai` SDK):

```js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.TOKENROUTE_API_KEY,
  baseURL: 'https://api.tokenroute.io/v1',
});

const resp = await client.chat.completions.create({
  model: 'gpt-4o-mini', // or claude-sonnet-4-5, gemini-2.5-pro, etc.
  messages: [{ role: 'user', content: 'Hello' }],
});
console.log(resp.choices[0].message.content);
```

Set `"stream": true` to receive tokens as they are generated, in the standard OpenAI streaming format:

```bash
curl https://api.tokenroute.io/v1/chat/completions \
  -H "Authorization: Bearer $TOKENROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to 5."}]
  }'
```
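The same stream is easier to consume through the SDK. A sketch, assuming the usual OpenAI chunk shape where text arrives in `choices[0].delta.content`:

```python
from openai import OpenAI

client = OpenAI(api_key="sk-tr-...", base_url="https://api.tokenroute.io/v1")

stream = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Count to 5."}],
    stream=True,
)
for chunk in stream:
    # The final chunk carries a finish_reason and no content.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```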
Errors come back in the standard OpenAI error envelope with one of these statuses:

| HTTP | Error `type` | Meaning |
|---|---|---|
| 401 | authentication_error | Missing / invalid / revoked key |
| 402 | insufficient_quota | Balance below the request's reservation ceiling |
| 403 | permission_error | Key's allowed_models doesn't include this model |
| 429 | rate_limit_error | RPM or TPM exceeded — see Retry-After header |
| 502 | upstream_error | Provider failed; reservation refunded |
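Through the official SDK these statuses surface as typed exceptions. A sketch of the mapping; the exception names come from the `openai` Python package, not from tokenroute itself:

```python
import openai
from openai import OpenAI

client = OpenAI(api_key="sk-tr-...", base_url="https://api.tokenroute.io/v1")

try:
    resp = client.chat.completions.create(
        model="claude-opus-4-1",
        messages=[{"role": "user", "content": "Hello"}],
    )
except openai.AuthenticationError:    # 401 authentication_error
    raise SystemExit("check TOKENROUTE_API_KEY: missing, invalid, or revoked")
except openai.PermissionDeniedError:  # 403 permission_error (allowed_models)
    raise SystemExit("this key may not call that model")
except openai.RateLimitError as e:    # 429 rate_limit_error
    print("rate limited, retry after", e.response.headers.get("retry-after"), "s")
except openai.APIStatusError as e:    # 402 insufficient_quota, 502 upstream_error, etc.
    print(e.status_code, e.message)
```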
Each key can carry an RPM (requests/minute) and TPM (tokens/minute) cap; both default to unlimited. When you hit a cap you get HTTP 429 with a Retry-After header (in seconds) and an error payload like:
```json
{
  "error": {
    "message": "Rate limit exceeded (rpm). Retry after 12s.",
    "type": "rate_limit_error",
    "code": 429
  }
}
```
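If you would rather retry than fail, the Retry-After value can drive the backoff. A sketch with the `openai` SDK; the helper name is ours, not part of any API:

```python
import time

import openai
from openai import OpenAI

client = OpenAI(api_key="sk-tr-...", base_url="https://api.tokenroute.io/v1")

def create_with_retry(max_tries=5, **kwargs):
    """Retry chat completions on 429, honoring Retry-After when present."""
    for attempt in range(max_tries):
        try:
            return client.chat.completions.create(**kwargs)
        except openai.RateLimitError as e:
            retry_after = e.response.headers.get("retry-after")
            # Fall back to exponential backoff if the header is absent.
            time.sleep(float(retry_after) if retry_after else 2 ** attempt)
    raise RuntimeError(f"still rate limited after {max_tries} attempts")

resp = create_with_retry(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
```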
Need a higher limit? Email service@tokenroute.io.

We implement the chat-completions surface: `POST /v1/chat/completions` and `GET /v1/models`. Streaming, function calling, multi-turn conversations, and system prompts pass through as-is. Embeddings and the Assistants API are not supported yet.

Anything missing or wrong here? service@tokenroute.io.