An OpenAI-compatible API over OpenAI, Anthropic, Google Gemini, DeepSeek, and Mistral. Drop-in replacement for api.openai.com/v1: same paths, same JSON. Pass any model id we list and we forward the request to the right provider, billed against your tokenroute balance.
Sign in, create an API key, top up your balance, and point any OpenAI-compatible client at https://api.tokenroute.io/v1:
```bash
curl https://api.tokenroute.io/v1/chat/completions \
  -H "Authorization: Bearer $TOKENROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Write a haiku about routing."}]
  }'
```

Pass your key as a Bearer token. Keys start with `sk-tr-`. The key's secret is shown only once, at creation; store it in an environment variable, because we don't keep a copy you can read.
```
Authorization: Bearer sk-tr-XXXXXXXXXXXXXXXXXXXXXXXX
```

Lost a key? Revoke it in /dashboard/keys and create a new one.
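To sanity-check a key before wiring it into an app, you can call `GET /v1/models` (see the endpoints note at the end) with the same header. A minimal sketch, assuming the key is exported as `TOKENROUTE_API_KEY` and that the endpoint returns the standard OpenAI list shape:

```python
import os
import requests

# Any 401 here means the key is missing, invalid, or revoked.
resp = requests.get(
    "https://api.tokenroute.io/v1/models",
    headers={"Authorization": f"Bearer {os.environ['TOKENROUTE_API_KEY']}"},
    timeout=30,
)
resp.raise_for_status()
print("key OK:", len(resp.json()["data"]), "models visible")
```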
Pass any of these strings as the `model` field. Each forwards to the corresponding upstream — your tokenroute key is the only credential you carry.
| Model | Provider | Best for |
|---|---|---|
| gpt-4o, gpt-4o-mini | OpenAI | General-purpose, reliable |
| claude-opus-4-1, claude-sonnet-4-5, claude-haiku-4-5 | Anthropic | Coding, long context, tool use |
| gemini-2.5-pro, gemini-2.0-flash | Google | Multilingual, video & image input; very cheap Flash tier |
| deepseek-chat, deepseek-reasoner | DeepSeek | Strong code & math at a fraction of frontier cost; deepseek-reasoner (R1) for reasoning |
| mistral-large, codestral | Mistral | European hosting; Codestral specialised for code completion |
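You can also discover the ids at runtime: in the official SDK, `client.models.list()` calls `GET /v1/models`. A sketch, assuming the standard OpenAI list response:

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-tr-...",  # your tokenroute key
    base_url="https://api.tokenroute.io/v1",
)

# Each entry's .id is a string you can pass as the "model" field.
for model in client.models.list():
    print(model.id)
```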
Python (official `openai` SDK):

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-tr-...",
    base_url="https://api.tokenroute.io/v1",
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-5",  # or gpt-4o, gemini-2.5-pro, deepseek-reasoner…
    messages=[{"role": "user", "content": "Prove there are infinitely many primes."}],
)
print(resp.choices[0].message.content)
```

Node.js (official `openai` SDK):

```js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.TOKENROUTE_API_KEY,
  baseURL: 'https://api.tokenroute.io/v1',
});

const resp = await client.chat.completions.create({
  model: 'gpt-4o-mini', // or claude-sonnet-4-5, gemini-2.5-pro, etc.
  messages: [{ role: 'user', content: 'Hello' }],
});
console.log(resp.choices[0].message.content);
```

Set `"stream": true` to receive tokens as they are generated, in the standard OpenAI streaming format:

```bash
curl https://api.tokenroute.io/v1/chat/completions \
  -H "Authorization: Bearer $TOKENROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to 5."}]
  }'
```
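The same stream is easier to consume through the SDK. A sketch, assuming the usual OpenAI chunk shape where text arrives in `choices[0].delta.content`:

```python
from openai import OpenAI

client = OpenAI(api_key="sk-tr-...", base_url="https://api.tokenroute.io/v1")

stream = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Count to 5."}],
    stream=True,
)
for chunk in stream:
    # The final chunk carries a finish_reason and no content.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```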
Errors come back in the standard OpenAI error envelope with one of these statuses:

| HTTP | Error `type` | Meaning |
|---|---|---|
| 401 | authentication_error | Missing / invalid / revoked key |
| 402 | insufficient_quota | Balance below the request's reservation ceiling |
| 403 | permission_error | Key's allowed_models doesn't include this model |
| 429 | rate_limit_error | RPM or TPM exceeded — see Retry-After header |
| 502 | upstream_error | Provider failed; reservation refunded |
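Through the official SDK these statuses surface as typed exceptions. A sketch of the mapping; the exception names come from the `openai` Python package, not from tokenroute itself:

```python
import openai
from openai import OpenAI

client = OpenAI(api_key="sk-tr-...", base_url="https://api.tokenroute.io/v1")

try:
    resp = client.chat.completions.create(
        model="claude-opus-4-1",
        messages=[{"role": "user", "content": "Hello"}],
    )
except openai.AuthenticationError:    # 401 authentication_error
    raise SystemExit("check TOKENROUTE_API_KEY: missing, invalid, or revoked")
except openai.PermissionDeniedError:  # 403 permission_error (allowed_models)
    raise SystemExit("this key may not call that model")
except openai.RateLimitError as e:    # 429 rate_limit_error
    print("rate limited, retry after", e.response.headers.get("retry-after"), "s")
except openai.APIStatusError as e:    # 402 insufficient_quota, 502 upstream_error, etc.
    print(e.status_code, e.message)
```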
Each key can carry an RPM (requests/minute) and TPM (tokens/minute) cap; both default to unlimited. When you hit a cap you get HTTP 429 with a Retry-After header (in seconds) and an error payload like:
```json
{
  "error": {
    "message": "Rate limit exceeded (rpm). Retry after 12s.",
    "type": "rate_limit_error",
    "code": 429
  }
}
```
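If you would rather retry than fail, the Retry-After value can drive the backoff. A sketch with the `openai` SDK; the helper name is ours, not part of any API:

```python
import time

import openai
from openai import OpenAI

client = OpenAI(api_key="sk-tr-...", base_url="https://api.tokenroute.io/v1")

def create_with_retry(max_tries=5, **kwargs):
    """Retry chat completions on 429, honoring Retry-After when present."""
    for attempt in range(max_tries):
        try:
            return client.chat.completions.create(**kwargs)
        except openai.RateLimitError as e:
            retry_after = e.response.headers.get("retry-after")
            # Fall back to exponential backoff if the header is absent.
            time.sleep(float(retry_after) if retry_after else 2 ** attempt)
    raise RuntimeError(f"still rate limited after {max_tries} attempts")

resp = create_with_retry(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
```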
Need a higher limit? Email service@tokenroute.io.

We implement the chat-completions surface: `POST /v1/chat/completions` and `GET /v1/models`. Streaming, function calling, multi-turn conversations, and system prompts pass through as-is. Embeddings and the Assistants API are not supported yet.

Anything missing or wrong here? service@tokenroute.io.