All gateways operational

Drop-in compatibility. Point Claude Code or Codex at one endpoint. Cut inference costs without rewriting your stack.

One API key.
Every frontier model.

Built for developers, founders, and agencies routing production LLM traffic. Klint handles provider auth, cheapest-route selection, and metering — you ship instead of wiring.

base_url https://api.tryklint.com/v1
klint — bash
Status 200
Time 142ms
Route opus

p50 latency

68ms

Requests today

2.4M

Uptime

99.97%

Works with your stack

Compatible tools and models — not partnerships or sponsorships. Point your existing workflow at Klint.

VS Code
Terminal
GitHub
Anthropic logoAnthropic
OpenAI logoOpenAI
Google logoGoogle

Drop-in compatible

Point any OpenAI SDK at api.tryklint.com

Claude Code, Codex, and custom agents — swap the base URL, keep your code. Cut provider fragmentation on day one.

1curl https://api.tryklint.com/v1/chat/completions \ 2 -H "Authorization: Bearer $KLINT_API_KEY" \ 3 -H "Content-Type: application/json" \ 4 -d '{ 5 "model": "gpt-5.5-mini", 6 "messages": [{"role": "user", "content": "Hello!"}] 7 }'

Process

Production routing in under two minutes

Three steps from signup to routed inference — no stack rewrite required.

01

Select plan

Pick Pro, 5Max, or 20Max based on token volume. Agencies scale without renegotiating provider contracts.

02

Generate API key

Create production and staging keys from the dashboard. One key routes to every model.

03

Point Claude Code or Codex

Set base_url to api.tryklint.com/v1. Your existing SDK and agent configs keep working.

Observability

Every request logged, traced, and ready to inspect.

Debug routing decisions and inference spend without leaving your workflow.

Live request streamConnected
POST/v1/chat/completions200142ms
GET/v1/models20024ms
POST/v1/embeddings20089ms
POST/v1/chat/completions42912ms
POST/v1/chat/completions200138ms
POST/v1/messages200201ms

Models

One key across every major provider

GPT-5.5, Claude Opus, Grok 3, Gemini 2.5 Pro — route by model name, not provider account.

ModelProviderDescriptionWeightContext
GPT-5.5openai logoopenaiOpenAI flagship multimodal model1×256K
GPT-5.5 Miniopenai logoopenaiFast Codex-optimized variant0.3×128K
Claude Opusanthropic logoanthropicAnthropic flagship reasoning model1.4×200K
Claude Sonnetanthropic logoanthropicBalanced coding and reasoning for Claude Code0.9×200K
Grok 3xai logoxaixAI frontier model with real-time context1.1×131K
Gemini 2.5 Progoogle logogoogleGoogle advanced multimodal reasoning1.2×2000K
Llama 4 70Bmeta logometaOpen-weight model via Klint proxy0.4×128K
Mistral Large 3mistral logomistralMistral flagship model0.8×128K

Full catalog and pricing weights in model documentation.

Pricing

Cut inference costs at scale

Token-based billing for developers, founders, and agencies. Compare volume, routing tier, and cost-saving features side by side.

MonthlyAnnual −10%
FeaturePro5MaxMost popular20Max
Monthly price$39$89$199
Token volume500M tokens/mo1.2B tokens/mo5B tokens/mo
Claude Code + Codex
Enterprise routingStandardOptimizedOptimized
Agency / multi-projectTeam billingUnlimited keys
Token rollover30 days90 days90 days
Includes500M tokens per month1.2B tokens per month5B tokens per month
Claude Code + Codex routingEnterprise-grade routingCustom SLAs
Team billing and access controlsDedicated edge nodesDedicated support
Priority supportAdvanced rate limitsUnlimited API keys
Usage analyticsAudit logsEnterprise SSO

Pro

For teams running AI in production at scale.

Get started

5Max

High-volume workloads with optimized infrastructure scaling.

Get started

20Max

Maximum throughput for mission-critical AI systems.

Contact sales

Infrastructure

Global edge routing with sub-50ms gateway overhead.

Edge nodes
24
Regions
8
Avg overhead
<50ms
Uptime
99.97%

Cost routing

Cheapest healthy path, every request

Stop overpaying for direct provider access. Klint routes to the lowest-cost endpoint that meets your latency SLA.

GPT-5.5

OpenAI flagship multimodal model

Weight

1×

Context

256K

GPT-5.5 Mini

Fast Codex-optimized variant

Weight

0.3×

Context

128K

Claude Opus

Anthropic flagship reasoning model

Weight

1.4×

Context

200K

Claude Sonnet

Balanced coding and reasoning for Claude Code

Weight

0.9×

Context

200K

Teams

Built for production inference at scale

We cut provider billing from four dashboards to one. Klint paid for itself in the first week.

Priya Sharma

Staff Engineer

Agency clients run Claude Code and Codex through one key. No more per-project provider setup.

Jordan Ellis

Founder, AI Automation Agency

Latency matches direct calls. Cheapest-route selection actually shows up in our cost reports.

Zara Ahmed

CTO, Series A Startup

Swapped base_url in our agent stack in ten minutes. Inference spend dropped 18% month one.

Marcus Chen

Indie Developer

FAQ

Common questions about Klint

Still have questions? support@tryklint.com