tokenroute.io
On this page
  • Quickstart
  • Authentication
  • Models
  • Examples
  • Error codes
  • Rate limits
  • OpenAI compatibility

Docs

OpenAI-compatible API over OpenAI, Anthropic, Google Gemini, DeepSeek, and Mistral. Drop-in replacement for api.openai.com/v1 — same paths, same JSON; pass any model id we list and we forward to the right provider, billed against your tokenroute balance.

Quickstart

Sign in, create an API key, top up some balance, and call any OpenAI-compatible client against https://api.tokenroute.io/v1.

curl https://api.tokenroute.io/v1/chat/completions \
  -H "Authorization: Bearer $TOKENROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Write a haiku about routing."}]
  }'

Authentication

Pass your key as a Bearer token. Keys start with sk-tr-. The secret is shown only once, at creation — store it in an environment variable, because we don't keep a copy you can read.

Authorization: Bearer sk-tr-XXXXXXXXXXXXXXXXXXXXXXXX
Lost a key? Revoke it in /dashboard/keys and create a new one.
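If you're making raw HTTP calls outside the SDKs, building the headers yourself is straightforward. A minimal sketch — the sk-tr- prefix check is purely a local sanity check, not something the API enforces client-side:

```python
import os

def auth_headers(key: str) -> dict:
    """Build the request headers for a raw HTTP call to the API."""
    # Local sanity check: tokenroute keys start with "sk-tr-"
    if not key.startswith("sk-tr-"):
        raise ValueError("expected a key with the sk-tr- prefix")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }

headers = auth_headers(os.environ.get("TOKENROUTE_API_KEY", "sk-tr-example"))
```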

Models

Pass any of these strings as the model field. Each forwards to the corresponding upstream — your tokenroute key is the only credential you carry.

Model                                                | Provider  | Best for
gpt-4o, gpt-4o-mini                                  | OpenAI    | General-purpose, reliable
claude-opus-4-1, claude-sonnet-4-5, claude-haiku-4-5 | Anthropic | Coding, long context, tool use
gemini-2.5-pro, gemini-2.0-flash                     | Google    | Multilingual, video & image, very cheap flash tier
deepseek-chat, deepseek-reasoner                     | DeepSeek  | Strong code & math at a fraction of frontier cost; deepseek-reasoner (R1) for reasoning
mistral-large, codestral                             | Mistral   | European hosting; Codestral specialised for code completion
Live pricing for your account is in /dashboard/models. More providers (Chinese-market models, Cohere, Together) are on the roadmap.
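For illustration, the table above as a lookup — a sketch of the routing idea only, not the service's actual implementation; fetch GET /v1/models for the live list:

```python
# Routing table mirroring the model list above (illustrative only)
PROVIDER_BY_MODEL = {
    "gpt-4o": "OpenAI",
    "gpt-4o-mini": "OpenAI",
    "claude-opus-4-1": "Anthropic",
    "claude-sonnet-4-5": "Anthropic",
    "claude-haiku-4-5": "Anthropic",
    "gemini-2.5-pro": "Google",
    "gemini-2.0-flash": "Google",
    "deepseek-chat": "DeepSeek",
    "deepseek-reasoner": "DeepSeek",
    "mistral-large": "Mistral",
    "codestral": "Mistral",
}

def provider_for(model: str) -> str:
    """Return the upstream provider a model id routes to."""
    if model not in PROVIDER_BY_MODEL:
        # The real API would reject an unknown id rather than route it
        raise ValueError(f"unknown model id: {model}")
    return PROVIDER_BY_MODEL[model]
```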

Examples

Python (openai SDK)
from openai import OpenAI

client = OpenAI(
    api_key="sk-tr-...",
    base_url="https://api.tokenroute.io/v1",
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-5",  # or gpt-4o, gemini-2.5-pro, deepseek-reasoner…
    messages=[{"role": "user", "content": "Prove there are infinitely many primes."}],
)
print(resp.choices[0].message.content)
Node (openai SDK)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.TOKENROUTE_API_KEY,
  baseURL: 'https://api.tokenroute.io/v1',
});

const resp = await client.chat.completions.create({
  model: 'gpt-4o-mini',  // or claude-sonnet-4-5, gemini-2.5-pro, etc.
  messages: [{ role: 'user', content: 'Hello' }],
});
console.log(resp.choices[0].message.content);
Streaming
curl https://api.tokenroute.io/v1/chat/completions \
  -H "Authorization: Bearer $TOKENROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to 5."}]
  }'
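With stream: true the response comes back as OpenAI-style server-sent events: one data: line per chunk, terminated by data: [DONE]. A minimal parser sketch, assuming that standard SSE shape:

```python
import json

def delta_from_sse_line(line: str):
    """Extract the text delta from one SSE line, or None for keep-alives/[DONE]."""
    if not line.startswith("data: "):
        return None  # blank keep-alive lines and SSE comments
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream sentinel
    choice = json.loads(payload)["choices"][0]
    return choice["delta"].get("content")
```

The SDKs do this for you — pass stream=True to client.chat.completions.create and iterate the returned stream.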

Error codes

HTTP | type                 | Meaning
401  | authentication_error | Missing, invalid, or revoked key
402  | insufficient_quota   | Balance below the request's reservation ceiling
403  | permission_error     | Key's allowed_models doesn't include this model
429  | rate_limit_error     | RPM or TPM exceeded — see the Retry-After header
502  | upstream_error       | Provider failed; reservation refunded
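In client code, the useful split is retryable vs. not. A sketch mirroring the table above — is_retryable reflects that a 429 clears once the rate window passes and a 502 refunds the reservation, while 401/402/403 need human intervention:

```python
# Status-to-type mapping from the error table above
ERROR_TYPE_BY_STATUS = {
    401: "authentication_error",
    402: "insufficient_quota",
    403: "permission_error",
    429: "rate_limit_error",
    502: "upstream_error",
}

def is_retryable(status: int) -> bool:
    """True for transient errors where an automatic retry makes sense."""
    # 429: wait out the rate window. 502: reservation refunded, retry is free.
    return status in (429, 502)
```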

Rate limits

Each key can carry an RPM (requests per minute) and TPM (tokens per minute) cap; both default to unlimited. When you exceed a cap, you get HTTP 429 with Retry-After in seconds and an error payload like:

{
  "error": {
    "message": "Rate limit exceeded (rpm). Retry after 12s.",
    "type": "rate_limit_error",
    "code": 429
  }
}
Need a higher limit? Email service@tokenroute.io.
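A retry helper that honours Retry-After, falling back to jittered exponential backoff when the header is absent — a client-side sketch; the 60-second cap is an arbitrary choice:

```python
import random

def wait_before_retry(headers: dict, attempt: int) -> float:
    """Seconds to sleep before retrying a 429, honouring Retry-After."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)  # server told us exactly how long to wait
    # No header: jittered exponential backoff, capped at 60 s
    return min(60.0, (2 ** attempt) + random.random())
```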

OpenAI compatibility

We implement the chat-completions surface: POST /v1/chat/completions and GET /v1/models. Streaming, function calling, multi-turn conversations, and system prompts pass through as-is. Embeddings and the Assistants API are not supported yet.
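Since function calling passes through unchanged, the standard OpenAI tools payload works as-is. A sketch with a made-up get_weather tool:

```python
# Standard OpenAI-format tool definition; get_weather is a hypothetical example
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

# Passed straight through to any model that supports tool use, e.g.:
# client.chat.completions.create(model="claude-sonnet-4-5",
#                                messages=messages, tools=tools)
```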

Anything missing or wrong here? service@tokenroute.io.